It's set to 600 seconds, so I don't think that is the issue. Is there some
sort of wait time to set in job_conf.xml?
From: Evan Bollig [mailto:boll0...@umn.edu]
Sent: Thursday, June 12, 2014 9:27 AM
To: Shrum, Donald C
Subject: Re: [galaxy-dev] jobs submitted to a cluster
What is the value of keep_completed on your queue (from qmgr -c 'p s')? Could
it be that your spool is flushing completed jobs immediately? I ran into issues
the other day with libdrmaa requiring at least keep_completed = 60 seconds to
properly detect completed jobs and clean up after itself.
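
As a rough illustration of the check suggested above, here is a minimal Python
sketch that scans the text printed by qmgr -c 'p s' for keep_completed
settings and flags any below 60 seconds. It assumes the usual
"set queue <name> keep_completed = <n>" and "set server keep_completed = <n>"
line format. The helper names and the sample text are hypothetical, and the
60-second minimum is only the figure reported in this thread, not a documented
libdrmaa limit.

```python
import re

# Matches lines such as "set queue batch keep_completed = 60" or
# "set server keep_completed = 300" in the text printed by qmgr -c 'p s'.
_KEEP_RE = re.compile(
    r"^set\s+(server|queue\s+(\S+))\s+keep_completed\s*=\s*(\d+)\s*$"
)


def find_keep_completed(qmgr_output):
    """Return {scope: seconds} for every keep_completed setting found."""
    settings = {}
    for line in qmgr_output.splitlines():
        match = _KEEP_RE.match(line.strip())
        if match:
            # group(2) is the queue name; it is None for the server-wide line.
            scope = match.group(2) or "server"
            settings[scope] = int(match.group(3))
    return settings


def too_short(settings, minimum=60):
    """List scopes whose keep_completed is below the minimum (in seconds)."""
    return sorted(name for name, secs in settings.items() if secs < minimum)


# Hypothetical sample output, not taken from the cluster discussed here.
sample = """\
create queue genacc_q
set queue genacc_q queue_type = Execution
set queue genacc_q keep_completed = 5
set server keep_completed = 600
"""
settings = find_keep_completed(sample)
print(settings)
print(too_short(settings))
```

In practice the text would come from running qmgr -c 'p s' on the scheduler
host and passing its output to find_keep_completed.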
Research Associate | Application Developer | User Support Consultant Minnesota
599 Walter Library
612 624 1447
On Thu, Jun 12, 2014 at 7:36 AM, Shrum, Donald C <dcsh...@admin.fsu.edu> wrote:
> I've set up Galaxy to submit jobs to our HPC cluster as the logged-in user. I
> used the drmaa Python module to submit the jobs to our Moab server.
> It appears that the working directory for a submitted job is being removed by
> Galaxy before the job completes on the cluster.
> I can see a working directory is created in the logs:
> galaxy.jobs DEBUG 2014-06-12 08:21:03,786 (15) Working directory for
> job is:
> I've confirmed by watching the file system that the directory is created, and
> that it is deleted within about two seconds of being created.
> [root@admin 000]# watch -d ls -lR
> Every 2.0s: ls -lR
> Thu Jun 12 08:21:06
> total 64
> drwxrwxrwx 2 dcshrum dcshrum 4096 Jun 12 08:21 15
> I see the job sent via DRMAA:
> galaxy.jobs.handler DEBUG 2014-06-12 08:21:03,795 (15) Dispatching to drmaa runner
> galaxy.jobs.runners.drmaa DEBUG 2014-06-12 08:21:05,566 (15) submitting file
> ry/000/15/galaxy_15.sh
> galaxy.jobs.runners.drmaa DEBUG 2014-06-12 08:21:05,566 (15) native specification is: -N galaxyjob -l nodes=1,walltime=2:00 -q genacc_q
> galaxy.jobs.runners.drmaa DEBUG 2014-06-12 08:21:05,892 (15) submitting with credentials: dcshrum [uid: 232706]
> galaxy.jobs.runners.drmaa INFO 2014-06-12 08:21:06,196 (15) queued as 7570705.moab.local
> The job fails:
> galaxy.jobs.runners.drmaa DEBUG 2014-06-12 08:21:06,698
> (15/7570705.moab.local) state change: job finished, but failed
> galaxy.jobs.runners DEBUG 2014-06-12 08:21:07,124 (15/7570705.moab.local)
> Unable to cleanup
> [Errno 2] No such file or directory:
> I can see the same error in my moab log:
> *** error from copy
> /bin/cp: cannot create regular file
> ory/000/15/galaxy_15.o': No such file or directory
> *** end error output
> Any idea why Galaxy removes the working directory? Is there a setting
> in job_conf.xml that would resolve this?
> Thanks for any pointers.
> FSU Research Computing Center
> Please keep all replies on the list by using "reply all"
> in your mail client. To manage your subscriptions to this and other
> Galaxy lists, please use the interface at:
> To search Galaxy mailing lists use the unified search at: