I noticed the recent conversation about the "not-enough-memory" issue and
the suggestion that it may be caused by the latest drmaa library. I wonder
whether another problem of mine is caused by the same drmaa library:
We do not yet have many users for Galaxy; I am still installing tools and
loading datasets. As a result there can be periods of several hours when
nobody runs any tool from the Galaxy UI. When someone finally does, the
job is submitted but never finishes. As far as I can see, it was never
actually submitted to LSF: it is not known to the "bjobs" command at all.
The log just says:
- dispatching job 244 to drmaa runner
- job 244 dispatched
- (244) submitting file /home/galaxy/galaxy-dist/database/pbs/galaxy_244.sh
- (244) command is: seqret ...
But it does not continue by the usual:
- (244) queued as 222391
- (244/222391) state change: job is queued and active
It simply waits forever. However, and here is the interesting point: if at
this moment I manually submit any job to LSF (not through Galaxy, but using
the same LSF queue that Galaxy uses), e.g. something like
bsub -q galaxy -o $HOME/tmp/test.log "/usr/bin/env > $HOME/tmp/test.output"
this simple job is queued and completed, AND the jobs started earlier from
the Galaxy UI, which had so far been waiting (somewhere), are suddenly
queued and executed normally. Strange, isn't it?
My current workaround is a bit silly: a cron job runs my simple bsub
command (as in the example above) every hour, and I have no problem
starting jobs from Galaxy even after a prolonged period of inactivity. But
I wonder whether somebody has noticed similar behaviour, or whether it is
worth using drmaa at all.
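For reference, the workaround can be sketched as a single crontab entry (the queue name, log paths, and schedule here are just my setup, not anything Galaxy-specific; adjust them as needed):

```shell
# m h dom mon dow  command
# Every hour, submit a trivial LSF job to the same queue that Galaxy's
# drmaa runner uses.  The job only dumps the environment; its sole purpose
# is to generate LSF activity so the stuck Galaxy jobs get dispatched.
0 * * * *  bsub -q galaxy -o $HOME/tmp/test.log "/usr/bin/env > $HOME/tmp/test.output"
```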
Thanks for any help and cheers,