Hi Scott,

Following some failing hard drives, I'm rebuilding our Galaxy server. Something isn't quite right with our cluster integration yet, but it has exposed a problem in Galaxy's handling of task splitting - it can sometimes attempt to merge zero files.

Here is my fix for the BLAST XML format (now in the ToolShed):
https://bitbucket.org/peterjc/galaxy-central/changeset/5cb6411bad19802ba4001a083164366b42850a48

Here's an example using the text format:

    galaxy.jobs.splitters.multi ERROR 2012-10-18 16:26:21,330 Error merging files
    Traceback (most recent call last):
      File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/splitters/multi.py", line 133, in do_merge
        output_type.merge(output_files, output_file_name)
      File "/mnt/galaxy/galaxy-central/lib/galaxy/datatypes/data.py", line 545, in merge
        raise Exception('Result %s from %s' % (result, cmd))
    Exception: Result 2 from cat > /mnt/galaxy/galaxy-central/database/files/000/dataset_304.dat

The problem obviously is that while "cat file1 ... fileN > merged" will work fine for one or more files, with no files it sits waiting for stdin (and from a user perspective stalls).

This logic error is in lib/galaxy/datatypes/data.py method merge, which could either treat zero files as an error, or a no-op:

    if len(split_files) == 1:
        cmd = 'mv -f %s %s' % ( split_files[0], output_file )
    else:
        cmd = 'cat %s > %s' % ( ' '.join(split_files), output_file )
    result = os.system(cmd)

I think this should be something like this:

    if not split_files:
        raise Exception('Asked to merge zero files')
    elif len(split_files) == 1:
        cmd = 'mv -f %s %s' % ( split_files[0], output_file )
    else:
        cmd = 'cat %s > %s' % ( ' '.join(split_files), output_file )
    result = os.system(cmd)

It might also make sense to check for zero files in the code which calls the merge, i.e.
lib/galaxy/jobs/splitters/multi.py function do_merge

I'm still investigating upstream how this comes about, one clue:

    galaxy.jobs.runners.drmaa DEBUG 2012-10-18 16:25:01,930 (273/510) state change: job is running
    galaxy.jobs.runners.drmaa DEBUG 2012-10-18 16:25:03,040 (273/510) state change: job finished, but failed
    galaxy.jobs.runners.drmaa DEBUG 2012-10-18 16:25:03,074 Job output not returned from cluster
    galaxy.jobs DEBUG 2012-10-18 16:25:03,074 task 641 for job 273 ended; exit code: 0
    galaxy.jobs DEBUG 2012-10-18 16:25:03,148 task 641 ended
    galaxy.jobs.runners.tasks DEBUG 2012-10-18 16:25:05,169 execution finished - beginning merge: tblastx -query "/mnt/galaxy/galaxy-central/database/files/000/dataset_127.dat" -db "/var/local/blast/ncbi/nt" -query_gencode 2 -evalue 0.001 -out /mnt/galaxy/galaxy-central/database/files/000/dataset_304.dat -outfmt 0 -num_threads 8
    galaxy.jobs.splitters.multi DEBUG 2012-10-18 16:25:05,181 files []

If you would prefer that small suggestion as a pull request, let me know.

Regards,

Peter
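
P.S. For anyone who wants to try the zero-file guard outside Galaxy, here is a minimal standalone sketch. It is not the Galaxy code: the function name merge_split_files and the demo helper are hypothetical, and it swaps the os.system 'mv'/'cat' calls for shutil so that it runs without a shell. It treats zero files as an error, which is one of the two options mentioned above; a no-op would be the other.

```python
import os
import shutil
import tempfile


def merge_split_files(split_files, output_file):
    """Hypothetical stand-in for the merge method discussed above."""
    if not split_files:
        # Fail fast rather than running "cat > output" with no inputs.
        raise ValueError('Asked to merge zero files')
    if len(split_files) == 1:
        # Single piece: move it into place (the "mv -f" case).
        shutil.move(split_files[0], output_file)
        return
    # Several pieces: concatenate them in order (the "cat" case).
    with open(output_file, 'wb') as out:
        for name in split_files:
            with open(name, 'rb') as part:
                shutil.copyfileobj(part, out)


def demo():
    """Merge two small files, then check that an empty list is rejected."""
    workdir = tempfile.mkdtemp()
    try:
        parts = []
        for i, text in enumerate(['first\n', 'second\n']):
            path = os.path.join(workdir, 'part%d.txt' % i)
            with open(path, 'w') as handle:
                handle.write(text)
            parts.append(path)
        merged = os.path.join(workdir, 'merged.txt')
        merge_split_files(parts, merged)
        with open(merged) as handle:
            content = handle.read()
        try:
            merge_split_files([], merged)
            rejected = False
        except ValueError:
            rejected = True
        return content, rejected
    finally:
        # Clean up the temporary directory whatever happens.
        shutil.rmtree(workdir)


if __name__ == '__main__':
    print(demo())
```

The same "if not split_files" check could equally sit in the caller (do_merge), so that the merge is never attempted when the task produced no output files.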