Re: [galaxy-dev] Specifying a range of inputs?

2011-09-24 Thread Peter Cock
I realise we went off list accidentally.

On Saturday, September 24, 2011, Timothy Wu 2hug...@gmail.com wrote:
> On Sat, Sep 24, 2011 at 6:52 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
>
>> Elegant maybe, but the UI side does not seem
>> practical with the current Galaxy.
>
> Is there any future plan for better multiple-file support?

It has been discussed a little in other contexts,
like running a workflow many times in parallel.

>
>> I would write a tool to get the data from the NCBI into
>> Galaxy as a single FASTA file in this situation. I'm
>> assuming there is some rule to pick the particular
>> set (e.g. an Entrez search, perhaps by taxonomy).
>
> Yeah, I think that's what I'll have to do. Thanks. :)
>
> Timothy

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] Specifying a range of inputs?

2011-09-22 Thread Timothy Wu
Hi,

I'm implementing my own tools. One of them may take hundreds of input
files (or maybe I'm doing it wrong?). I copied the tool config XML from
the Concatenate datasets tool, but that requires adding files one by
one. Is there a quicker way to specify these files by job number? Say I
want the data from jobs 42 to 323, something like that.

I haven't attempted to run the tool yet. I'm also curious whether there is
a limit to how long a command can be at the shell level. I suspect the tool
won't be able to run, especially since each Galaxy file is given by its
absolute path.
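
For reference, POSIX systems do cap the combined size of the arguments
and environment passed to a new process (ARG_MAX). The following is only
a rough sketch for estimating whether a list of paths would fit. The
function name fits_on_command_line and the reserve margin are
hypothetical, and the arithmetic is an approximation rather than an exact
account of how the kernel measures the limit.

```python
import os


def fits_on_command_line(paths, reserve=4096):
    """Estimate whether `paths` fit within the OS argument-size limit.

    Returns True or False, or None if the limit cannot be queried
    (for example on a non-POSIX platform). `reserve` is an arbitrary
    safety margin, not a documented constant.
    """
    try:
        arg_max = os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return None
    if arg_max <= 0:
        return None
    # The environment shares the same budget as the arguments.
    env_size = sum(
        len(os.fsencode(key)) + len(os.fsencode(value)) + 2
        for key, value in os.environ.items()
    )
    # Each argument is stored with a terminating NUL byte.
    args_size = sum(len(os.fsencode(path)) + 1 for path in paths)
    return args_size + env_size + reserve <= arg_max
```

A caller could pass the planned list of dataset paths and fall back to
another mechanism (such as a file listing the paths) when the result is
False.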

Timothy

Re: [galaxy-dev] Specifying a range of inputs?

2011-09-22 Thread Timothy Wu
On Thu, Sep 22, 2011 at 4:36 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

> How does your tool handle this at the command line (ignoring Galaxy)?
> Does it expect a directory name or pattern, or just a really long
> command line string with many many file names?


Originally I had a config text file which specifies a directory, and the
scripts look in this directory for specific file name patterns. Since
Galaxy assigns its own file names, the patterns would not work.

I'm actually tailoring my tools for Galaxy because my original design is not
flexible and just not well thought out. With Galaxy I'm pretty happy
that I get to split my tools up to be more fine-grained, in an attempt to
stick to the Unix philosophy of "Write programs that do one thing and do it
well" (well, more to the "one" part than to the "well" part).

I am thinking of a few work-arounds.

1. Assuming that there is only one user, I could have the user specify the
first file and then the number of files that would also be inputs, and I
can have the tool figure out the file paths from the path of the first plus
the number increments.

2. For the tool prior to this, which generates these files (actually an FTP
download tool which downloads .tar.gz files), I would have it also unzip
and untar them and then concatenate them.

3. For the tool prior to this, if there is any way the tool could know which
file names it is writing to (as far as I know it cannot, at least not
according to what is described under "Number of Output datasets cannot be
determined until tool run" (
http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files )), then it
can output a text file which lists the paths of the files. The subsequent
tool can take this single file as input.
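
Workaround 3 could be sketched roughly as below, under the assumption
that the producing tool does know the final paths of its files (which is
exactly the open question here). The names write_manifest and
read_manifest are hypothetical helpers for illustration, not part of
Galaxy.

```python
import os


def write_manifest(paths, manifest_path):
    """Producing tool: write one absolute file path per line."""
    with open(manifest_path, "w") as handle:
        for path in paths:
            handle.write(os.path.abspath(path) + "\n")


def read_manifest(manifest_path):
    """Consuming tool: return the listed paths, skipping blank lines."""
    with open(manifest_path) as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]
```

The subsequent tool then needs only the manifest as its single input,
which also keeps its command line short regardless of how many files are
listed.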

I don't like 1 since it requires that the service be used by a single user
(otherwise the numbering could get messed up).

I don't like 2, since it violates the principle of Unix tools. It doesn't
seem like it's the design decision the Galaxy team would take. Furthermore, I
think unzipping unnecessarily takes up disk space. My program just
parses directly from the gzip, but without unzipping I don't know how to
reasonably concatenate the files.
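
One relevant property of the gzip format: a file may contain several
compressed members back to back, so appending .gz files byte for byte
gives a valid gzip file that decompresses to the concatenation of the
originals. The sketch below illustrates this for plain gzip files; the
helper name concat_gzip is hypothetical. Note the caveat for this case:
with .tar.gz inputs the decompressed result is a series of tar archives
one after another, which a reader would need to handle explicitly.

```python
import shutil


def concat_gzip(sources, dest):
    """Append gzip files byte for byte without decompressing them."""
    with open(dest, "wb") as out:
        for source in sources:
            with open(source, "rb") as handle:
                shutil.copyfileobj(handle, out)
```

Reading the result with Python's gzip module (gzip.open) yields the
contents of each source file in order, with no uncompressed copy written
to disk.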

I like 3 best, but I do not know the paths of the outputs, since it is
Galaxy which silently moves and renames the files behind the scenes. Any
suggestions?

Timothy