Re: [galaxy-dev] AttributeError: type object 'InvalidJobException' has no attribute 'name'

2013-10-14 Thread Peter Cock
On Fri, Oct 11, 2013 at 9:12 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Thu, Oct 10, 2013 at 8:20 PM, Nate Coraor n...@bx.psu.edu wrote:
 Hi all,

 The recent changes to the DRMAA runner are for better handling
 of job-ending conditions under slurm, but it looks like SGE has
 different behavior when a job finishes.  I'll provide a fix for this
 shortly, in the meantime, it's fine to use a slightly older version
 of drmaa.py.

 --nate

 Thanks Nate,

 So far I've only seen this once so it isn't urgent for me.

 Peter

Hi Nate,

I see you've fixed the name attribute error:
https://bitbucket.org/galaxy/galaxy-central/commits/ff76fd33b81cdde1fb270de688ec5e86488ba34d

However it seems the underlying problem (job check resulted in:
code 18: The job specified by the 'jobid' does not exist.) is affecting
other people now:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-October/017002.html

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] AttributeError: type object 'InvalidJobException' has no attribute 'name'

2013-10-14 Thread Nate Coraor
Hi Peter,

I'm working on refactoring the DRMAA runner to allow for these different DRM 
behaviors without duplicating the code.  In the interim, I've reverted the 
change to the DRMAA runner that resulted in the observed behavior in changeset 
d46b64f12c52.

Thanks,
--nate
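
One way to picture "different DRM behaviors without duplicating the code" is a shared polling method with a small overridable hook. This is only a sketch, not the actual Galaxy refactoring: the class names, the `on_invalid_job` hook, and the return values are all hypothetical, and `InvalidJobException` here is a stand-in for the one in the drmaa module.

```python
class InvalidJobException(Exception):
    """Stand-in for drmaa.InvalidJobException so the sketch runs without a cluster."""


class BaseDRMAARunner:
    """Shared polling logic; subclasses only decide what a vanished job means."""

    def __init__(self, job_status):
        # job_status is any callable mapping a job id to a state string.
        self.job_status = job_status

    def on_invalid_job(self, job_id):
        # Hypothetical default: keep watching, the DRM may just be slow to report.
        return "keep"

    def check(self, job_id):
        try:
            return self.job_status(job_id)
        except InvalidJobException:
            return self.on_invalid_job(job_id)


class SGERunner(BaseDRMAARunner):
    def on_invalid_job(self, job_id):
        # Assumption for illustration only: on SGE a job id that no longer
        # exists is treated as a job that has ended.
        return "finished"
```

With this shape, only the hook differs per DRM and the try/except around the status call is written once.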


Re: [galaxy-dev] AttributeError: type object 'InvalidJobException' has no attribute 'name'

2013-10-11 Thread Peter Cock
On Thu, Oct 10, 2013 at 8:20 PM, Nate Coraor n...@bx.psu.edu wrote:
 Hi all,

 The recent changes to the DRMAA runner are for better handling
 of job-ending conditions under slurm, but it looks like SGE has
 different behavior when a job finishes.  I'll provide a fix for this
 shortly, in the meantime, it's fine to use a slightly older version
 of drmaa.py.

 --nate

Thanks Nate,

So far I've only seen this once so it isn't urgent for me.

Peter


Re: [galaxy-dev] AttributeError: type object 'InvalidJobException' has no attribute 'name'

2013-10-10 Thread Peter Cock
On Tue, Oct 8, 2013 at 5:03 PM, Adhemar azn...@gmail.com wrote:
 Hi,
 After the last update I'm getting the following error.
 The job is submitted to SGE and executed, but galaxy doesn't get the result
 and keeps showing the job is executing (yellow box).
 Any clues?
 Thanks,
 Adhemar



 galaxy.jobs.runners ERROR 2013-10-08 13:01:18,488 Unhandled exception checking active jobs
 Traceback (most recent call last):
   File "/opt/bioinformatics/share/galaxy20130410/lib/galaxy/jobs/runners/__init__.py", line 362, in monitor
     self.check_watched_items()
   File "/opt/bioinformatics/share/galaxy20130410/lib/galaxy/jobs/runners/drmaa.py", line 217, in check_watched_items
     log.warning( "(%s/%s) job check resulted in %s: %s", galaxy_id_tag, external_job_id, e.__class__.name, e )
 AttributeError: type object 'InvalidJobException' has no attribute 'name'


Same here, running galaxy-central with an SGE cluster (actually UGE
but the same DRMAA wrapper etc) when cancelling several jobs via
qdel at the command line:

galaxy.jobs.runners ERROR 2013-10-10 15:16:35,731 Unhandled exception checking active jobs
Traceback (most recent call last):
  File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/__init__.py", line 362, in monitor
    self.check_watched_items()
  File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 217, in check_watched_items
    log.warning( "(%s/%s) job check resulted in %s: %s", galaxy_id_tag, external_job_id, e.__class__.name, e )
AttributeError: type object 'InvalidJobException' has no attribute 'name'

$ hg branch
default
[galaxy@ppserver galaxy-central]$ hg heads | more
changeset:   11871:c8b55344e779
tag:         tip
user:        Ross Lazarus ross.laza...@gmail.com
date:        Tue Oct 08 16:30:54 2013 +1100
summary:     Proper removal of rgenetics deprecated tool wrappers

changeset:   11818:1f0e7ae9e324
branch:      stable
parent:      11761:a477486bf18e
user:        Daniel Blankenberg d...@bx.psu.edu
date:        Sun Sep 29 16:04:31 2013 +1000
summary:     Add additional check and slice to _sniffnfix_pg9_hex(). Fixes issue seen when attempting to view saved visualizations. Further investigation may be needed.
...

Killing Galaxy and restarting didn't fix this, the errors persist.
I tried this fix to solve the attribute error in the logging call:

$ hg diff /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py
diff -r c8b55344e779 lib/galaxy/jobs/runners/drmaa.py
--- a/lib/galaxy/jobs/runners/drmaa.py  Tue Oct 08 16:30:54 2013 +1100
+++ b/lib/galaxy/jobs/runners/drmaa.py  Thu Oct 10 15:21:56 2013 +0100
@@ -214,7 +214,10 @@
                 state = self.ds.jobStatus( external_job_id )
             # TODO: probably need to keep track of InvalidJobException count and remove after it exceeds some configurable
             except ( drmaa.DrmCommunicationException, drmaa.InternalException, drmaa.InvalidJobException ), e:
-                log.warning( "(%s/%s) job check resulted in %s: %s", galaxy_id_tag, external_job_id, e.__class__.name, e )
+                if hasattr(e.__class__, "name"):
+                    log.warning( "(%s/%s) job check resulted in %s: %s", galaxy_id_tag, external_job_id, e.__class__.name, e )
+                else:
+                    log.warning( "(%s/%s) job check resulted in: %s", galaxy_id_tag, external_job_id, e )
                 new_watched.append( ajs )
                 continue
             except Exception, e:
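
For what it's worth, Python classes expose their name as `__name__` rather than `name`, so the `hasattr` guard could probably be avoided altogether. A minimal sketch in current Python syntax, using a stand-in exception class rather than the real drmaa one; the `describe` helper is hypothetical:

```python
class InvalidJobException(Exception):
    """Stand-in for drmaa.InvalidJobException so the sketch runs without a cluster."""


def describe(exc):
    # type(exc).__name__ is defined for every class; a plain `name` attribute is not.
    return "job check resulted in %s: %s" % (type(exc).__name__, exc)


try:
    raise InvalidJobException("code 18: The job specified by the 'jobid' does not exist.")
except InvalidJobException as e:
    message = describe(e)
    print(message)
```

This keeps the exception type in the log line, which the `else` branch of the patch drops.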


Now I get lots of these lines instead:

galaxy.jobs.runners.drmaa WARNING 2013-10-10 15:22:16,489 (251/11372) job check resulted in: code 18: The job specified by the 'jobid' does not exist.
galaxy.jobs.runners.drmaa WARNING 2013-10-10 15:22:16,533 (252/11373) job check resulted in: code 18: The job specified by the 'jobid' does not exist.
galaxy.jobs.runners.drmaa WARNING 2013-10-10 15:22:17,580 (253/11374) job check resulted in: code 18: The job specified by the 'jobid' does not exist.
galaxy.jobs.runners.drmaa WARNING 2013-10-10 15:22:17,624 (254/11375) job check resulted in: code 18: The job specified by the 'jobid' does not exist.
galaxy.jobs.runners.drmaa WARNING 2013-10-10 15:22:17,668 (255/11376) job check resulted in: code 18: The job specified by the 'jobid' does not exist.
galaxy.jobs.runners.drmaa WARNING 2013-10-10 15:22:17,712 (256/11377) job check resulted in: code 18: The job specified by the 'jobid' does not exist.
(this seems to repeat, endlessly)

I manually killed the jobs from the Galaxy history, and restarted
Galaxy again. That seemed to fix this.

If the DRMAA layer says the job was invalid (which is what I am
assuming InvalidJobException means) then surely it failed?
Perhaps something like this (untested)?

$ hg diff /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py
diff -r c8b55344e779 lib/galaxy/jobs/runners/drmaa.py
--- a/lib/galaxy/jobs/runners/drmaa.pyTue Oct 08 16:30:54 2013 +1100
+++ b/lib/galaxy/jobs/runners/drmaa.pyThu Oct 10 15:27:28 2013 +0100
@@ -213,10 +213,15 @@
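
The patch itself is cut off in the archive. The idea from the TODO comment and the paragraph above (count how often a job id is reported as invalid, and stop watching it once a configurable limit is reached) might look roughly like the following. This is a sketch under assumptions, not the missing patch: `check_jobs`, `MAX_INVALID_CHECKS`, and the stand-in exception class are all hypothetical names.

```python
import logging

log = logging.getLogger(__name__)

MAX_INVALID_CHECKS = 3  # hypothetical limit; would be configurable in practice


class InvalidJobException(Exception):
    """Stand-in for drmaa.InvalidJobException so the sketch runs without a cluster."""


def check_jobs(watched, job_status, invalid_counts, max_invalid=MAX_INVALID_CHECKS):
    """Poll each job id once; return (still_watched, failed)."""
    still_watched, failed = [], []
    for job_id in watched:
        try:
            job_status(job_id)
        except InvalidJobException as e:
            invalid_counts[job_id] = invalid_counts.get(job_id, 0) + 1
            log.warning("(%s) job check resulted in %s: %s", job_id, type(e).__name__, e)
            if invalid_counts[job_id] >= max_invalid:
                # Give up: report the job as failed instead of warning forever.
                failed.append(job_id)
                continue
        else:
            # A successful check resets the counter for this job.
            invalid_counts.pop(job_id, None)
        still_watched.append(job_id)
    return still_watched, failed


def always_missing(job_id):
    raise InvalidJobException("code 18: The job specified by the 'jobid' does not exist.")


counts = {}
watched = ["11372"]
for _ in range(MAX_INVALID_CHECKS):
    watched, failed = check_jobs(watched, always_missing, counts)
```

After the third consecutive invalid check the job id moves from the watched list to the failed list, which would stop the endless warnings shown earlier.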
 

Re: [galaxy-dev] AttributeError: type object 'InvalidJobException' has no attribute 'name'

2013-10-10 Thread Nate Coraor
Hi all,

The recent changes to the DRMAA runner are for better handling of job-ending
conditions under slurm, but it looks like SGE has different behavior when a job
finishes.  I'll provide a fix for this shortly; in the meantime, it's fine to
use a slightly older version of drmaa.py.

--nate


Re: [galaxy-dev] AttributeError: type object 'InvalidJobException' has no attribute 'name'

2013-10-08 Thread Adhemar
To test that, I've just downloaded a new galaxy-central and
configured it to submit jobs to our SGE cluster. Same problem. The job
starts and finishes, but Galaxy keeps reporting that it's still running.
I've also attached the job_conf.xml file.
Need help, please!
-Adhemar


galaxy.jobs.runners ERROR 2013-10-08 15:29:16,721 Unhandled exception checking active jobs
Traceback (most recent call last):
  File "/opt/bioinformatics/share/galaxy20131008/lib/galaxy/jobs/runners/__init__.py", line 362, in monitor
    self.check_watched_items()
  File "/opt/bioinformatics/share/galaxy20131008/lib/galaxy/jobs/runners/drmaa.py", line 217, in check_watched_items
    log.warning( "(%s/%s) job check resulted in %s: %s", galaxy_id_tag, external_job_id, e.__class__.name, e )
AttributeError: type object 'InvalidJobException' has no attribute 'name'



<?xml version="1.0"?>
<job_conf>
    <plugins workers="10">
        <!-- <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="10"/> -->
        <plugin id="sge" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="10"/>
    </plugins>
    <handlers default="handlers">
        <handler id="main"/>
        <handler id="handler0" tags="handlers"/>
        <handler id="handler1" tags="handlers"/>
        <handler id="handler2" tags="handlers"/>
        <handler id="handler3" tags="handlers"/>
        <handler id="handler4" tags="handlers"/>
        <handler id="handler5" tags="handlers"/>
        <handler id="handler6" tags="handlers"/>
        <handler id="handler7" tags="handlers"/>
        <handler id="handler8" tags="handlers"/>
        <handler id="handler9" tags="handlers"/>
    </handlers>
    <destinations default="sge_cluster">
        <!--
        <destination id="local" runner="local"/>
        -->
        <destination id="sge_cluster" runner="sge" tags="longjobs">
            <param id="nativeSpecification">-V -q galaxy.q</param>
        </destination>
    </destinations>
</job_conf>
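
To double-check which runner plugin and native specification a destination resolves to, the config can be read with Python's standard library. This is only a sketch: it embeds a trimmed copy of the file above, assuming the XML markup that the list archive stripped, and it does not use any Galaxy code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the attached job_conf.xml (handlers omitted).
JOB_CONF = """<?xml version="1.0"?>
<job_conf>
    <plugins workers="10">
        <plugin id="sge" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="10"/>
    </plugins>
    <destinations default="sge_cluster">
        <destination id="sge_cluster" runner="sge" tags="longjobs">
            <param id="nativeSpecification">-V -q galaxy.q</param>
        </destination>
    </destinations>
</job_conf>
"""

root = ET.fromstring(JOB_CONF)

# Map plugin id -> "module:Class" string from the load attribute.
plugins = {p.get("id"): p.get("load") for p in root.iter("plugin")}
default_dest = root.find("destinations").get("default")

for dest in root.iter("destination"):
    runner = dest.get("runner")
    native = dest.findtext("param[@id='nativeSpecification']")
    print(dest.get("id"), "->", plugins.get(runner), "|", native)
```

Here the default destination `sge_cluster` points at the `sge` plugin, that is, the DRMAA runner whose drmaa.py appears in the traceback.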







2013/10/8 Adhemar azn...@gmail.com

 Hi,
 After the last update I'm getting the following error.
 The job is submitted to SGE and executed, but galaxy doesn't get the result
 and keeps showing the job is executing (yellow box).
 Any clues?
 Thanks,
 Adhemar



 galaxy.jobs.runners ERROR 2013-10-08 13:01:18,488 Unhandled exception checking active jobs
 Traceback (most recent call last):
   File "/opt/bioinformatics/share/galaxy20130410/lib/galaxy/jobs/runners/__init__.py", line 362, in monitor
     self.check_watched_items()
   File "/opt/bioinformatics/share/galaxy20130410/lib/galaxy/jobs/runners/drmaa.py", line 217, in check_watched_items
     log.warning( "(%s/%s) job check resulted in %s: %s", galaxy_id_tag, external_job_id, e.__class__.name, e )
 AttributeError: type object 'InvalidJobException' has no attribute 'name'

