Hey Peter-

Thanks - I'll look into it. If you're able to reproduce the problem easily
and wouldn't mind crafting a pull request, then it would be much 
appreciated. Otherwise I'll put this on my to-do list to be done soon.
I or someone else may want to revisit the exception handling to prevent
that from happening.

Thanks!

-Scott

----- Original Message -----
> Hi Scott,
> 
> Following some failing hard drives, I'm rebuilding our Galaxy server.
> Something isn't quite right with our cluster integration yet, but it has
> exposed a problem in Galaxy's handling of task splitting - it can
> sometimes attempt to merge zero files.
> 
> Here is my fix for the BLAST XML format (now in the ToolShed),
> https://bitbucket.org/peterjc/galaxy-central/changeset/5cb6411bad19802ba4001a083164366b42850a48
> 
> Here's an example using the text format:
> 
> galaxy.jobs.splitters.multi ERROR 2012-10-18 16:26:21,330 Error merging files
> Traceback (most recent call last):
>   File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/splitters/multi.py", line 133, in do_merge
>     output_type.merge(output_files, output_file_name)
>   File "/mnt/galaxy/galaxy-central/lib/galaxy/datatypes/data.py", line 545, in merge
>     raise Exception('Result %s from %s' % (result, cmd))
> Exception: Result 2 from cat  > /mnt/galaxy/galaxy-central/database/files/000/dataset_304.dat
> 
> The problem is that while "cat file1 ... fileN > merged" works fine
> for one or more files, with no files cat sits waiting on stdin (and
> from a user's perspective the job stalls).
> 
> The logic error is in the merge method in lib/galaxy/datatypes/data.py,
> which could treat zero files either as an error or as a no-op:
> 
>         if len(split_files) == 1:
>             cmd = 'mv -f %s %s' % ( split_files[0], output_file )
>         else:
>             cmd = 'cat %s > %s' % ( ' '.join(split_files), output_file )
>         result = os.system(cmd)
> 
> I think this should be something like this:
> 
>         if not split_files:
>             raise Exception('Asked to merge zero files')
>         elif len(split_files) == 1:
>             cmd = 'mv -f %s %s' % ( split_files[0], output_file )
>         else:
>             cmd = 'cat %s > %s' % ( ' '.join(split_files), output_file )
>         result = os.system(cmd)
> 
> It might also make sense to check for zero files in the code which
> calls the merge, i.e. the do_merge function in
> lib/galaxy/jobs/splitters/multi.py.
> 
> I'm still investigating upstream how this comes about; one clue:
> 
> galaxy.jobs.runners.drmaa DEBUG 2012-10-18 16:25:01,930 (273/510) state change: job is running
> galaxy.jobs.runners.drmaa DEBUG 2012-10-18 16:25:03,040 (273/510) state change: job finished, but failed
> galaxy.jobs.runners.drmaa DEBUG 2012-10-18 16:25:03,074 Job output not returned from cluster
> galaxy.jobs DEBUG 2012-10-18 16:25:03,074 task 641 for job 273 ended; exit code: 0
> galaxy.jobs DEBUG 2012-10-18 16:25:03,148 task 641 ended
> galaxy.jobs.runners.tasks DEBUG 2012-10-18 16:25:05,169 execution finished - beginning merge: tblastx -query "/mnt/galaxy/galaxy-central/database/files/000/dataset_127.dat"   -db "/var/local/blast/ncbi/nt" -query_gencode 2 -evalue 0.001 -out /mnt/galaxy/galaxy-central/database/files/000/dataset_304.dat -outfmt 0 -num_threads 8
> galaxy.jobs.splitters.multi DEBUG 2012-10-18 16:25:05,181 files []
> 
> If you would prefer that small suggestion as a pull request, let me
> know.
> 
> Regards,
> 
> Peter
> 
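
For illustration, here is a minimal sketch of the zero-file guard proposed in the quoted message. It avoids the shell, so an empty list can never reach "cat". The function name merge_files and the use of shutil are assumptions made for this example; this is not Galaxy's actual code.

```python
import shutil


def merge_files(split_files, output_file):
    """Concatenate split_files into output_file (hypothetical helper)."""
    if not split_files:
        # Fail fast rather than running `cat > output`, which waits on stdin.
        raise ValueError("Asked to merge zero files")
    if len(split_files) == 1:
        # Single part: move it into place, much like `mv -f`.
        shutil.move(split_files[0], output_file)
        return
    # Several parts: stream each one into the output in order.
    with open(output_file, "wb") as out:
        for part in split_files:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
```

Raising on an empty list treats zero files as an error, which matches the suggestion above. Returning early instead would give the no-op behaviour, but it might hide the upstream failure that produced the empty list in the first place.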