Hi List,
I'm having the same issue: if I dispatch a job to the dynamic job runner and it
then hands the job to pbs, Galaxy starts throwing errors once I restart the
server. In my case, though, there is no error in the job state; the job is
sitting in the Torque queue where it should be. As far as I can tell, Galaxy
successfully reloads the job back into the history, but it fails when it tries
to recover the job's destination for pbs.
If I delete the queued jobs, it quiets down, but this is not a sustainable
solution - I need to be able to reboot the server without all of my users'
cluster jobs causing faults every second.
Log:
galaxy.jobs.runners.pbs DEBUG 2013-10-04 15:50:32,454 Set default PBS server to m1.mason.indiana.edu
galaxy.jobs.runners DEBUG 2013-10-04 15:50:32,455 Starting 4 PBSRunner workers
galaxy.jobs DEBUG 2013-10-04 15:50:32,459 Loaded job runner 'galaxy.jobs.runners.pbs:PBSJobRunner' as 'pbs'
galaxy.jobs.handler DEBUG 2013-10-04 15:50:32,459 Loaded job runners plugins: local:pbs
galaxy.jobs.handler INFO 2013-10-04 15:50:32,460 job handler stop queue started
galaxy.jobs DEBUG 2013-10-04 15:50:32,478 (514) Working directory for job is: /N/dc/projects/galaxy/job_working_directory/000/514
galaxy.jobs.handler DEBUG 2013-10-04 15:50:32,478 recovering job 514 in pbs runner
galaxy.jobs WARNING 2013-10-04 15:50:32,478 (514) Job runner URLs are deprecated, use destinations instead.
galaxy.jobs.runners.pbs DEBUG 2013-10-04 15:50:32,479 (514/176938.m1.mason) is still in PBS queued state, adding to the PBS queue
galaxy.jobs.handler INFO 2013-10-04 15:50:32,485 job handler queue started
...
galaxy.jobs.runners ERROR 2013-10-04 15:50:33,456 Unhandled exception checking active jobs
Traceback (most recent call last):
  File "/N/hd03/galaxy/Mason/galaxy-ncgas/lib/galaxy/jobs/runners/__init__.py", line 362, in monitor
    self.check_watched_items()
  File "/N/hd03/galaxy/Mason/galaxy-ncgas/lib/galaxy/jobs/runners/pbs.py", line 385, in check_watched_items
    ( failures, statuses ) = self.check_all_jobs()
  File "/N/hd03/galaxy/Mason/galaxy-ncgas/lib/galaxy/jobs/runners/pbs.py", line 466, in check_all_jobs
    pbs_server_name = self.__get_pbs_server(pbs_job_state.job_destination.params)
  File "/N/hd03/galaxy/Mason/galaxy-ncgas/lib/galaxy/jobs/runners/pbs.py", line 222, in __get_pbs_server
    return job_destination_params['destination'].split('@')[-1]
KeyError: 'destination'
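
To illustrate what I think is going on, here is a minimal sketch, not the actual
Galaxy code. It assumes the recovered job's destination params simply have no
'destination' key, so the lookup on line 222 raises. The function name
get_pbs_server, the DEFAULT_PBS_SERVER constant, and the fallback to the default
server are my own assumptions for illustration; only the
split('@')[-1] expression and the server name come from the log above.

```python
# Hypothetical stand-in for the lookup shown in the traceback (pbs.py line 222).
# The default value is taken from the "Set default PBS server" log line.
DEFAULT_PBS_SERVER = "m1.mason.indiana.edu"


def get_pbs_server(job_destination_params, default_server=DEFAULT_PBS_SERVER):
    """Return the PBS server for a job, tolerating a missing 'destination'.

    Assumption: falling back to the default server is acceptable when a
    recovered job carries no 'destination' parameter.
    """
    if not job_destination_params:
        # Params missing or empty, e.g. a job recovered after a restart.
        return default_server
    destination = job_destination_params.get("destination")
    if destination is None:
        # This is the case that raises KeyError in the traceback above.
        return default_server
    # Same expression as the original line: keep the part after the last '@'.
    return destination.split("@")[-1]
```

With something like this, a recovered job with empty params would resolve to the
default server instead of raising on every monitor pass. Whether that is the
right behaviour for Galaxy is exactly what I'm unsure about.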
Thanks so much for any wisdom on this.
Sincerely,
Carrie Ganote