On Thu, Oct 18, 2012 at 5:19 PM, Scott McManus <scottmcma...@gatech.edu> wrote:
>
> Hey Peter-
>
> Thanks - I'll look into it. If you're able to reproduce the problem easily
> and wouldn't mind crafting a pull request, then it would be much
> appreciated. Otherwise I'll put this on my to-do list to be done soon.
> I or someone else may want to revisit the exception handling to prevent
> that from happening.
>
> Thanks!
>
> -Scott

OK then:
https://bitbucket.org/galaxy/galaxy-central/pull-request/78/avoid-stall-when-merging-zero-files-fao/diff

I can explain what was happening: we had a mount problem. The
Galaxy server could talk to SGE and submit jobs, but by the time
the jobs came to run, the mount providing their home directory
and the Galaxy file system was down, so they failed. Naturally
this meant Galaxy got no output files back.

From reading the code, it looks like you deliberately attempt to
merge any files that are present (e.g. if 9 out of 10 come back).
That does make sense, as the partial output could be instructive,
provided it is flagged as an error (which doesn't seem to be
happening).

I think getting zero files back from the split-jobs ought to be an
error condition. In fact, failing to get all the expected sub-files
back should also be an error condition (although it is still nice
to do the merge so the user can see the partial output).

I think a little refactoring might be needed to treat these
cases explicitly as errors.
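
To illustrate the idea, here is a rough sketch. It is hypothetical
and is neither the code in the pull request nor Galaxy's real merge
API: merge_split_outputs, MergeResult and MergeError are invented
names, and plain concatenation stands in for whatever merge a given
datatype actually needs. Zero files back is a hard error; a partial
set is still merged, but the result is flagged as failed.

```python
import os
import shutil
from dataclasses import dataclass
from typing import List


class MergeError(Exception):
    """Raised when none of the expected sub-task outputs came back."""


@dataclass
class MergeResult:
    merged_path: str
    missing: List[str]  # expected sub-files that were not found

    @property
    def ok(self) -> bool:
        # Only a complete set of sub-files counts as success.
        return not self.missing


def merge_split_outputs(expected_paths, merged_path):
    """Concatenate whichever expected files exist, reporting any gaps.

    Hypothetical helper for illustration only.
    """
    present = [p for p in expected_paths if os.path.isfile(p)]
    missing = [p for p in expected_paths if p not in present]

    if not present:
        # Nothing to merge: fail explicitly rather than carrying on.
        raise MergeError(
            "none of the %d expected sub-files came back" % len(expected_paths)
        )

    # Merge what we have so the user can still inspect partial output.
    with open(merged_path, "wb") as out:
        for path in present:
            with open(path, "rb") as part:
                shutil.copyfileobj(part, out)

    return MergeResult(merged_path=merged_path, missing=missing)
```

The caller would then mark the job as failed whenever MergeError is
raised or result.ok is false, while still keeping the partial merged
file in the latter case.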

Regards,

Peter
