[galaxy-dev] DRMAA Slurm error

2014-09-25 Thread Pardo Diaz, Alfonso
Hi,


I have configured a new Galaxy site with SLURM (version 14). I have one server with a Galaxy instance, one node running the SLURM controller, and two SLURM worker nodes. I have compiled SLURM-DRMAA from source. When I run “drmaa-run /bin/hostname” it works, but when I try to start the Galaxy server I get the following error:

Traceback (most recent call last):
  File /home/galaxy-dist/lib/galaxy/webapps/galaxy/buildapp.py, line 39, in 
app_factory
app = UniverseApplication( global_conf = global_conf, **kwargs )
  File /home/galaxy-dist/lib/galaxy/app.py, line 141, in __init__
self.job_manager = manager.JobManager( self )
  File /home/galaxy-dist/lib/galaxy/jobs/manager.py, line 23, in __init__
self.job_handler = handler.JobHandler( app )
  File /home/galaxy-dist/lib/galaxy/jobs/handler.py, line 32, in __init__
self.dispatcher = DefaultJobDispatcher( app )
  File /home/galaxy-dist/lib/galaxy/jobs/handler.py, line 704, in __init__
self.job_runners = self.app.job_config.get_job_runner_plugins( 
self.app.config.server_name )
  File /home/galaxy-dist/lib/galaxy/jobs/__init__.py, line 621, in 
get_job_runner_plugins
rval[id] = runner_class( self.app, runner[ 'workers' ], **runner.get( 
'kwds', {} ) )
  File /home/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py, line 81, in 
__init__
self.ds.initialize()
  File /home/galaxy-dist/eggs/drmaa-0.7.6-py2.6.egg/drmaa/session.py, line 
257, in initialize
py_drmaa_init(contactString)
  File /home/galaxy-dist/eggs/drmaa-0.7.6-py2.6.egg/drmaa/wrappers.py, line 
73, in py_drmaa_init
return _lib.drmaa_init(contact, error_buffer, sizeof(error_buffer))
  File /home/galaxy-dist/eggs/drmaa-0.7.6-py2.6.egg/drmaa/errors.py, line 
151, in error_check
raise _ERRORS[code - 1](error_string)
AlreadyActiveSessionException: code 11: DRMAA session already exist.
[root@galaxy-project galaxy-dist]#


This is my “job_conf.xml”:

<job_conf>
    <plugins workers="4">
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
        <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner"/>
        <plugin id="cli" type="runner" load="galaxy.jobs.runners.cli:ShellJobRunner"/>
        <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
            <param id="drmaa_library_path">/usr/local/lib/libdrmaa.so</param>
        </plugin>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <destinations default="drmaa_slurm">
        <destination id="local" runner="local"/>
        <destination id="multicore_local" runner="local">
            <param id="local_slots">4</param>
            <param id="embed_metadata_in_job">True</param>
            <job_metrics />
        </destination>
        <destination id="docker_local" runner="local">
            <param id="docker_enabled">true</param>
        </destination>
        <destination id="drmaa_slurm" runner="drmaa">
            <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
            <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
            <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
        </destination>
        <destination id="direct_slurm" runner="slurm">
            <param id="nativeSpecification">--time=00:01:00</param>
        </destination>
    </destinations>
    <resources default="default">
        <group id="default"></group>
        <group id="memoryonly">memory</group>
        <group id="all">processors,memory,time,project</group>
    </resources>
    <tools>
        <tool id="foo" handler="trackster_handler">
            <param id="source">trackster</param>
        </tool>
        <tool id="bar" destination="dynamic"/>
        <tool id="longbar" destination="dynamic" resources="all"/>
        <tool id="baz" handler="special_handlers" destination="bigmem"/>
    </tools>
    <limits>
        <limit type="registered_user_concurrent_jobs">2</limit>
        <limit type="anonymous_user_concurrent_jobs">1</limit>
        <limit type="destination_user_concurrent_jobs" id="local">1</limit>
        <limit type="destination_user_concurrent_jobs" tag="mycluster">2</limit>
        <limit type="destination_user_concurrent_jobs" tag="longjobs">1</limit>
        <limit type="destination_total_concurrent_jobs" id="local">16</limit>
        <limit type="destination_total_concurrent_jobs" tag="longjobs">100</limit>
        <limit type="walltime">24:00:00</limit>
        <limit type="output_size">10GB</limit>
    </limits>
</job_conf>


Can you help me? I am a newbie at Galaxy administration.




THANKS IN ADVANCE





Alfonso Pardo Diaz
System Administrator / Researcher
c/ Sola nº 1; 10200 Trujillo, ESPAÑA
Tel: +34 927 65 93 17 Fax: +34 927 32 32 37

CETA-Ciemat: http://www.ceta-ciemat.es/




Re: [galaxy-dev] DRMAA Slurm error

2014-09-25 Thread Pardo Diaz, Alfonso
Solved!


The problem was that my job_conf.xml configured two plugin entries that both use DRMAA. I deleted this entry:

<plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
    <param id="drmaa_library_path">/usr/local/lib/libdrmaa.so</param>
</plugin>

And now it works!
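
For reference, the plugins section that works now keeps just one DRMAA-based runner (a sketch of the resulting configuration, assuming the rest of job_conf.xml is unchanged; the DRMAA library path can still be provided through the DRMAA_LIBRARY_PATH environment variable):

<plugins workers="4">
    <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
    <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner"/>
    <plugin id="cli" type="runner" load="galaxy.jobs.runners.cli:ShellJobRunner"/>
</plugins>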


Thanks




[galaxy-dev] drmaa

2014-05-06 Thread Shrum, Donald C
Hi all,

I've configured Galaxy with the drmaa Python module.  This is really more of a
drmaa question...

We have no default queue set in Moab and I can't seem to find a way to specify
a queue in the docs I've been looking at here -
http://drmaa-python.readthedocs.org/en/latest/tutorials.html

I'd like to be able to specify a queue based on various pieces of logic in my 
destinations.py script.

Any suggestions would be appreciated.

Donny
FSU Research Computing Center


___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] drmaa

2014-05-06 Thread Nate Coraor
Hi Donny,

You should be able to specify the queue using the nativeSpecification field
of drmaa requests, e.g. in your job_conf.xml:

<destination id="batch" runner="pbs_drmaa">
    <param id="nativeSpecification">-q batch</param>
</destination>
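
If the queue has to be chosen per job (the destinations.py-style logic mentioned in the original question), the same field can also be set from a dynamic job destination. A minimal sketch, assuming Galaxy's dynamic-destination mechanism and a hypothetical choose_queue rule referenced from a <destination runner="dynamic"> entry; with plain drmaa-python the equivalent is simply setting nativeSpecification on the job template:

# rules file (e.g. destinations.py) -- hypothetical dynamic rule
from galaxy.jobs import JobDestination

def choose_queue(tool):
    # Route a couple of heavy tools to a long queue, everything else to 'batch'.
    queue = 'long' if tool.id in ('tophat2', 'cufflinks') else 'batch'
    return JobDestination(runner='pbs_drmaa',
                          params={'nativeSpecification': '-q %s' % queue})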

Documentation on job_conf.xml's syntax by runner can be found here:

https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster

--nate



Re: [galaxy-dev] DRMAA configuring issue

2014-04-25 Thread John Chilton
I feel like someone should respond to this but I must admit I don't
have a lot of ideas.

I assume you are able to use qsub to submit jobs from the Galaxy
server? This is worth verifying before anything else. If that
doesn't work, the system configuration needs to be modified.

I think there are a couple of different implementations of DRMAA for PBS:

http://apps.man.poznan.pl/trac/pbs-drmaa (I think this is the recommended one).
http://sourceforge.net/projects/pbspro-drmaa/

It might be worth compiling the latest and greatest of one or
both and trying both.

Galaxy also has a PBS runner that many people use for communicating
with Torque. I think the DRMAA runner should work - but this is a
fallback option as well just to get going.
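
One quick way to narrow problems like this down is to initialize a DRMAA session outside Galaxy, with the same user, drmaa Python module and DRMAA_LIBRARY_PATH that Galaxy uses. A minimal sketch (Python 2, matching the eggs in the tracebacks above):

# drmaa_check.py -- smoke test for the DRMAA library outside Galaxy
import drmaa

s = drmaa.Session()
s.initialize()                      # fails the same way Galaxy's runner would
print 'DRM system:', s.drmsInfo
print 'DRMAA implementation:', s.drmaaImplementation
s.exit()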

-John


Re: [galaxy-dev] DRMAA configuring issue

2014-04-22 Thread Hakeem Almabrazi
Does anyone have any tips about this, please :)?

Regards

From: galaxy-dev-boun...@lists.bx.psu.edu 
[mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Hakeem Almabrazi
Sent: Monday, April 21, 2014 3:49 PM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] DRMAA configuring issue


[galaxy-dev] DRMAA configuring issue

2014-04-21 Thread Hakeem Almabrazi
Hi,

I am trying to get the DRMAA runner working for my local Galaxy cluster.  However,
I am having a hard time configuring it on my system.


So far,

I have installed Torque 2.5.12 and it seems to work as expected.

I installed drmaa_1.0.17, and here is my DRMAA_LIBRARY_PATH:
(galaxy_env)galaxy@GalaxyTest01[/home/galaxy/galaxy-dist]$ echo $DRMAA_LIBRARY_PATH
/usr/local/lib/libdrmaa.so

My job_conf.xml:
<?xml version="1.0"?>
<!-- A sample job config that explicitly configures job running the way it is
     configured by default (if there is no explicit config). -->
<job_conf>
    <plugins>
        <plugin id="sge" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="4"/>
    </plugins>
    <handlers default="handlers">
        <handler id="main" tags="handlers"/>
    </handlers>
    <destinations default="sge_default">
        <destination id="sge_default" runner="drmaa"/>
    </destinations>
</job_conf>

This is the error I am getting when I start galaxy.

galaxy.jobs INFO 2014-04-21 15:37:30,730 Handler 'main' will load all 
configured runner plugins
Traceback (most recent call last):
  File /home/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/buildapp.py, line 
39, in app_factory
app = UniverseApplication( global_conf = global_conf, **kwargs )
  File /home/galaxy/galaxy-dist/lib/galaxy/app.py, line 130, in __init__
self.job_manager = manager.JobManager( self )
  File /home/galaxy/galaxy-dist/lib/galaxy/jobs/manager.py, line 31, in 
__init__
self.job_handler = handler.JobHandler( app )
  File /home/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py, line 30, in 
__init__
self.dispatcher = DefaultJobDispatcher( app )
  File /home/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py, line 568, in 
__init__
self.job_runners = self.app.job_config.get_job_runner_plugins( 
self.app.config.server_name )
  File /home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py, line 489, in 
get_job_runner_plugins
rval[id] = runner_class( self.app, runner[ 'workers' ], **runner.get( 
'kwds', {} ) )
  File /home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py, line 81, in 
__init__
self.ds.initialize()
  File /home/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/__init__.py, 
line 274, in initialize
_w.init(contactString)
  File /home/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/wrappers.py, 
line 59, in init
return _lib.drmaa_init(contact, error_buffer, sizeof(error_buffer))
  File /home/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/errors.py, 
line 90, in error_check
raise _ERRORS[code-1](code %s: %s % (code, error_buffer.value))
DrmCommunicationException: code 2: (null)
Removing PID file paster.pid

I am not sure what the issue is here or how to go about resolving it.  I would
really appreciate it if someone could tell me how to debug it.

Best regards

Hak


[galaxy-dev] DRMAA bug in galaxy-central

2014-01-17 Thread Bjoern Gruening
Hi,

for a few days now, frequent error messages have been popping up in my
Galaxy log files.

galaxy.jobs.runners ERROR 2014-01-17 13:11:17,094 Unhandled exception
checking active jobs
Traceback (most recent call last):
  File
/usr/local/galaxy/galaxy-dist/lib/galaxy/jobs/runners/__init__.py,
line 358, in monitor
self.check_watched_items()
  File /usr/local/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py,
line 238, in check_watched_items
if self.runner_params[ retry_param ] > 0:
TypeError: 'RunnerParams' object has no attribute '__getitem__'

Cheers,
Bjoern




Re: [galaxy-dev] DRMAA bug in galaxy-central

2014-01-17 Thread John Chilton
It is hard to test error states... but I assume you have the setup for
it :). Any chance you can apply these patches and let me know if they
fix the problem? I assume they will.

https://bitbucket.org/galaxy/galaxy-central/pull-request/300/potential-drmaa-fixes
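
For anyone hitting the same traceback before the fix lands: the error just means the runner params object is being indexed like a dict without supporting it. A toy illustration in plain Python (not Galaxy's actual classes; some_retry_param is made up):

class RunnerParams(object):
    # toy stand-in: values live only as attributes
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

params = RunnerParams(some_retry_param=0)
try:
    params['some_retry_param']     # indexing an object with no __getitem__ ...
except TypeError as exc:
    print exc                      # ... raises the same TypeError as the log

class IndexableRunnerParams(RunnerParams):
    # one way to support dict-style access: delegate [] to attributes
    def __getitem__(self, key):
        return getattr(self, key)

print IndexableRunnerParams(some_retry_param=3)['some_retry_param']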

-John



Re: [galaxy-dev] DRMAA bug in galaxy-central

2014-01-17 Thread Björn Grüning
Thanks John,

that fixed it for me!
Have a nice weekend,
Bjoern



[galaxy-dev] DRMAA/SGE job handling regression?

2013-11-11 Thread Peter Cock
Hello all,

On our main Galaxy instance, tracking galaxy-dist and using DRMAA/SGE,
jobs submitted to the cluster that are queued and waiting (qw)
are correctly shown in Galaxy as grey pending entries in
the history.

With my test instance tracking galaxy-central (along with a
new visual look and new icons), such jobs are wrongly
shown as yellow (running).

Is this a general regression affecting other people?

There also seem to be issues where killing a job in Galaxy
just hides it but it remains running (yellow once you tick
show deleted datasets, and running on SGE too). This
was working properly on galaxy-dist (the job was killed
on the cluster, and shown as red if you ticked show
deleted datasets).

Thanks,

Peter


Re: [galaxy-dev] DRMAA Runner URL Specify TORQUE host

2013-07-16 Thread Bart Gottschalk
I've had success using the pbs runner rather than drmaa runner for this
case.  It's quite straightforward to specify the pbs_server for the pbs
runner.  Works just as the documentation indicates.

- Bart


On Tue, Jul 9, 2013 at 1:46 PM, Bart Gottschalk bgott...@umn.edu wrote:

 I haven't been able to find a way to make the drmaa runner work in this
 situation.  I'm going to move on to trying this with a pbs runner instead.
 I will post to this thread if this works for me.

 - Bart


Re: [galaxy-dev] DRMAA Runner URL Specify TORQUE host

2013-07-09 Thread Bart Gottschalk
I haven't been able to find a way to make the drmaa runner work in this
situation.  I'm going to move on to trying this with a pbs runner instead.
I will post to this thread if this works for me.

- Bart

[galaxy-dev] DRMAA Runner URL Specify TORQUE host

2013-06-26 Thread Bart Gottschalk
Is it possible to specify the Torque host as part of a DRMAA runner URL?  I
haven't been able to find a native_options parameter to allow for this.
I'm using the old-style cluster configuration.

drmaa://[native_options]/

Also, I haven't been able to find a list of native_options anywhere.  Does
anyone have a link to a comprehensive list?

- Bart


Re: [galaxy-dev] DRMAA Runner URL Specify TORQUE host

2013-06-26 Thread Adam Brenner
Bart,

I believe drmaa://-q somehost@queue-name

will work. However I could be very wrong. It has been a while since I
messed with the actual drmaa runners.
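
In the old-style configuration that would go into universe_wsgi.ini as the runner URL, along the lines of the line below (a sketch only; I'm assuming the pre-job_conf.xml option name, and the native options are whatever your Torque setup actually accepts):

default_cluster_job_runner = drmaa://-q somehost@queue-name/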


--
Adam Brenner
Computer Science, Undergraduate Student
Donald Bren School of Information and Computer Sciences

Research Computing Support
Office of Information Technology
http://www.oit.uci.edu/rcs/

University of California, Irvine
www.ics.uci.edu/~aebrenne/
aebre...@uci.edu



[galaxy-dev] drmaa and JSV

2013-04-15 Thread jean-François Taly

Hi,

Our Galaxy instance runs jobs on an SGE cluster using 2 job handlers. The
SGE cluster uses a Job Submission Verifier (JSV) that rejects any job
submission that specifies core binding strategies.


When Galaxy starts, the first job we submit works perfectly:

First job submission:

galaxy.jobs.manager DEBUG 2013-04-15 14:29:59,285 (194) Job assigned to handler 'handler0'
galaxy.jobs DEBUG 2013-04-15 14:29:59,934 (194) Working directory for job is: /scratch/nfs/galaxy.crg.es/job_working_directory/000/194
galaxy.jobs.handler DEBUG 2013-04-15 14:29:59,942 dispatching job 194 to drmaa runner

galaxy.jobs.handler INFO 2013-04-15 14:30:00,166 (194) Job dispatched
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:00,468 (194) submitting 
file /scratch/nfs/galaxy.crg.es/ogs/galaxy_194.sh
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:00,468 (194) command 
is: python 
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/tools/fastq/fastq_stats.py 
'/data/www-bi/galaxy.crg.es/files/000/dataset_4.dat' 
'/data/www-bi/galaxy.crg.es/files/000/dataset_238.dat' 'sanger'
galaxy.jobs.runners.drmaa INFO 2013-04-15 14:30:01,538 (194) queued as 
458816
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:02,115 (194/458816) 
state change: job is queued and active



# qstat -cb -j 458816
==
job_number: 458816
exec_file:  job_scripts/458816
submission_time:Mon Apr 15 14:30:01 2013
owner:  www-bi
uid:66401
group:  www-bi
gid:501
sge_o_home: /data/www-bi
sge_o_log_name: www-bi
sge_o_path: 
/data/galaxy/apache/galaxy.crg.es/htdocs/scripts/galaxy-env/bin:/software/galaxy/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/data/www-bi/bin

sge_o_shell:/bin/bash
sge_o_workdir:  
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist

sge_o_host: galaxy
account:sge
stderr_path_list:   
NONE:galaxy:/scratch/nfs/galaxy.crg.es/job_working_directory/000/194/194.drmerr

reserve:y
hard resource_list: virtual_free=12G,h_rt=21600
mail_list:  www...@galaxy.crg.es
notify: FALSE
job_name:   g194_fastq_stats_jtaly_crg_es
stdout_path_list:   
NONE:galaxy:/scratch/nfs/galaxy.crg.es/job_working_directory/000/194/194.drmout

jobshare:   0
hard_queue_list:www-el6
env_list:
script_file:/scratch/nfs/galaxy.crg.es/ogs/galaxy_194.sh
parallel environment:  smp range: 2
verify_suitable_queues: 2
binding:set linear:2:0,0
scheduling info:queue instance pr-...@fenn.linux.crg.es 
dropped because it is overloaded: np_load_avg=1.70 (= 1.70 + 
0.50 * 0.00 with nproc=12) = 1.7
queue instance 
sh...@node-ib0209bi.linux.crg.es dropped because it is overloaded: 
np_load_avg=2.837500 (= 2.837500 + 0.50 * 0.00 with nproc=8) = 1.3
queue instance 
l...@node-ib0209bi.linux.crg.es dropped because it is overloaded: 
np_load_avg=2.837500 (= 2.837500 + 0.50 * 0.00 with nproc=8) = 1.3



The core binding has been added by our jsv script. This is correct.


But our second submission fails:

galaxy.jobs.runners.drmaa ERROR 2013-04-15 14:30:56,263 Uncaught 
exception queueing job

Traceback (most recent call last):
  File 
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py, 
line 144, in run_next

self.queue_job( obj )
  File 
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py, 
line 232, in queue_job

job_id = self.ds.runJob(jt)
  File 
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py, 
line 331, in runJob

_h.c(_w.drmaa_run_job, jid, _ct.sizeof(jid), jobTemplate)
  File 
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py, 
line 213, in c

return f(*(args + (error_buffer, sizeof(error_buffer
  File 
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py, 
line 90, in error_check

raise _ERRORS[code-1](code %s: %s % (code, error_buffer.value))
DeniedByDrmException: code 17: contact us: x...@xxx.es


If we look at the submitted params:

# cat /tmp/qsub_err.txt
$VAR1 = {
          'w' => 'e',
          'N' => 'g195_fastq_stats_jtaly_crg_es',
          'binding_amount' => '2',
          'CMDNAME' => '/scratch/nfs/galaxy.crg.es/ogs/galaxy_195.sh',
          'binding_type' => 'set',
          'M' => {
                   'www...@galaxy.crg.es' => undef
                 },
          'binding_strategy' => 'linear',
          'l_hard' => {
                        'virtual_free' => '12G',

Re: [galaxy-dev] DRMAA runner weirdness

2013-01-11 Thread Liisa Koski
Hello,
Can you please post the link to this patch? I do not see it in the mail 
thread and I too have noticed some issues with the DRMAA job running since 
updating to the Oct. 23rd distribution. I don't know if it is related yet 
but I'd like to try the patch to see. I have two local instances of Galaxy 
(prod and dev). On my dev instance (which is fully up to date), when I run 
the same job multiple times, sometimes it finishes and sometimes it dies;
this is independent of which node it runs on. My prod instance is still at
the Oct. 03 distribution and does not experience this problem. So I am 
afraid to update our production instance. 

Thanks in advance,
Liisa




From:   Kyle Ellrott kellr...@soe.ucsc.edu
To: Nate Coraor n...@bx.psu.edu
Cc: galaxy-dev@lists.bx.psu.edu galaxy-dev@lists.bx.psu.edu
Date:   10/01/2013 07:44 PM
Subject:Re: [galaxy-dev] DRMAA runner weirdness
Sent by:galaxy-dev-boun...@lists.bx.psu.edu



I did a merge of galaxy-central that included the patch you posted 
today. The scheduling problem seems to have gone away. Although I'm still 
getting back 'Job output not returned from cluster' for errors. This seems 
odd, as the system previously would output stderr correctly.

Kyle


On Thu, Jan 10, 2013 at 8:30 AM, Nate Coraor n...@bx.psu.edu wrote:
On Jan 9, 2013, at 12:18 AM, Kyle Ellrott wrote:

 I'm running a test Galaxy system on a cluster (merged galaxy-dist on 
Janurary 4th). And I've noticed some odd behavior from the DRMAA job 
runner.
 I'm running a multithread system, one web server, one job_manager, and 
three job_handlers. DRMAA is the default job runner (the command for 
tophat2 is drmaa://-V -l mem_total=7G -pe smp 2/), with SGE 6.2u5 being 
the engine underneath.

 My test involves trying to run three different Tophat2 jobs. The first 
two seem to start up (and get put on the SGE queue), but the third stays 
grey, with the job manager listing it in state 'new' with command line 
'None'. It doesn't seem to leave this state. Both of the jobs that 
actually got onto the queue die (reasons unknown, but much to early, 
probably some tophat/bowtie problem), but one job is listed in error state 
with stderr as 'Job output not returned from cluster', while the other job 
(which is no longer in the SGE queue) is still listed as running.

Hi Kyle,

It sounds like there are bunch of issues here.  Do you have any limits set 
as to the number of concurrent jobs allowed?  If not, you may need to add 
a bit of debugging information to the manager or handler code to figure 
out why the 'new' job is not being dispatched for execution.

For the 'error' job, more information about output collection should be 
available from the Galaxy server log.  If you have general SGE problems 
this may not be Galaxy's fault.  You do need to make sure that the 
stdout/stderr files are able to be properly copied back to the Galaxy 
server upon job completion.

For the 'running' job, make sure you've got 'set_metadata_externally = 
True' in your Galaxy config.
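
That setting is a single line in the main Galaxy config file (universe_wsgi.ini in releases of that era):

set_metadata_externally = True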

--nate


 Any ideas?


 Kyle

[galaxy-dev] DRMAA runner weirdness

2013-01-08 Thread Kyle Ellrott
I'm running a test Galaxy system on a cluster (merged galaxy-dist on
January 4th), and I've noticed some odd behavior from the DRMAA job
runner.
I'm running a multithread system, one web server, one job_manager, and
three job_handlers. DRMAA is the default job runner (the command for
tophat2 is drmaa://-V -l mem_total=7G -pe smp 2/), with SGE 6.2u5 being the
engine underneath.

My test involves trying to run three different Tophat2 jobs. The first two
seem to start up (and get put on the SGE queue), but the third stays grey,
with the job manager listing it in state 'new' with command line 'None'. It
doesn't seem to leave this state. Both of the jobs that actually got onto
the queue die (reasons unknown, but much too early, probably some
tophat/bowtie problem), but one job is listed in error state with stderr as
'Job output not returned from cluster', while the other job (which is no
longer in the SGE queue) is still listed as running.

Any ideas?

Kyle

Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault

2012-11-27 Thread Nate Coraor
On Nov 20, 2012, at 8:15 AM, Peter Cock wrote:

 Is anyone else seeing this? I am wary of applying the update to our
 production Galaxy until I know how to resolve this (other than just
 be disabling task splitting).

Hi Peter,

These look like two issues - in one, you've got task(s) in the database that do 
not have an external runner ID set, causing the drmaa runner to attempt to 
check the status of None, resulting in the segfault.  If you update the state 
of these tasks to something terminal, that should fix the issue with them.  Of 
course, if the same thing happens with new jobs, then there's another issue.
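
For anyone needing to do that cleanup by hand, something along these lines should work against the default SQLite database (a sketch only: the database path and the task table/columns are assumptions from memory of Galaxy's model, so check your schema and back up the database first):

# mark_stuck_tasks_failed.py -- one-off cleanup sketch
import sqlite3

conn = sqlite3.connect('database/universe.sqlite')
conn.execute(
    "UPDATE task SET state = 'error' "
    "WHERE job_id = ? AND state NOT IN ('ok', 'error', 'deleted')",
    (327,),   # 327 is just the job id from the tracebacks in this thread
)
conn.commit()
conn.close()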

I'm trying to reproduce the working directory behavior but have been 
unsuccessful.  Do you have any local modifications to the splitting or jobs 
code?

--nate

 


Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault

2012-11-27 Thread Nate Coraor
On Nov 27, 2012, at 12:03 PM, Peter Cock wrote:

 On Tue, Nov 27, 2012 at 4:50 PM, Nate Coraor n...@bx.psu.edu wrote:
 On Nov 20, 2012, at 8:15 AM, Peter Cock wrote:
 
 Is anyone else seeing this? I am wary of applying the update to our
 production Galaxy until I know how to resolve this (other than just
 be disabling task splitting).
 
 Hi Peter,
 
 These look like two issues - in one, you've got task(s) in the database
 that do not have an external runner ID set, causing the drmaa runner
 to attempt to check the status of None, resulting in the segfault.
 
 So a little defensive coding could prevent the segfault then (leaving
 the separate issue of why the jobs lack this information)?

Indeed, I pushed a check for this in 4a95ae9a26d9.

 If you update the state of these tasks to something terminal, that
 should fix the issue with them.
 
 You mean manually in the database? Restarting Galaxy seemed
 to achieve that in a round-about way.
 
 Of course, if the same things happens with new jobs, then there's
 another issue.
 
 This was a week ago, but yes, at the time it was reproducible
 with new jobs.

Is that to say it's still happening and you've simply worked around it (by 
disabling tasks), or that it is no longer happening?

 I'm trying to reproduce the working directory behavior but have
 been unsuccessful.  Do you have any local modifications to the
 splitting or jobs code?
 
 This was running on my tools branch, which shouldn't be changing
 Galaxy itself in any meaningful way (a few local variables did get
 accidentally checked into my run.sh file etc but otherwise I only
 try to modify new files specific to my individual tool wrappers):
 
 https://bitbucket.org/peterjc/galaxy-central/src/tools
 
 [galaxy@ppserver galaxy-central]$ hg branch
 tools
 
 [galaxy@ppserver galaxy-central]$ hg log -b tools | head -n 8
 changeset:   8807:d49200df0707
 branch:  tools
 tag: tip
 parent:  8712:959ee7c79fd2
 parent:  8806:340438c62171
 user:peterjc p.j.a.c...@googlemail.com
 date:Thu Nov 15 09:38:57 2012 +
 summary: Merged default into my tools branch
 
 The only deliberate change was to try and debug this,
 
  [galaxy@ppserver galaxy-central]$ hg diff
  diff -r d49200df0707 lib/galaxy/jobs/runners/drmaa.py
  --- a/lib/galaxy/jobs/runners/drmaa.py  Thu Nov 15 09:38:57 2012 +0000
  +++ b/lib/galaxy/jobs/runners/drmaa.py  Tue Nov 27 17:00:04 2012 +0000
  @@ -291,8 +291,15 @@
           for drm_job_state in self.watched:
               job_id = drm_job_state.job_id
               galaxy_job_id = drm_job_state.job_wrapper.job_id
  +            if job_id is None or job_id == "None":
  +                log.exception( "(%s/%r) Unable to check job status none" % ( galaxy_job_id, job_id ) )
  +                #drm_job_state.fail_message = "Cluster could not complete job (job_id None)"
  +                #Ignore it?
  +                #self.work_queue.put( ( 'fail', drm_job_state ) )
  +                continue
               old_state = drm_job_state.old_state
               try:
  +                assert job_id is not None and job_id != "None"
                   state = self.ds.jobStatus( job_id )
                   # InternalException was reported to be necessary on some DRMs, but
                   # this could cause failures to be detected as completion!  Please
 
 I'm about to go home for the day but should be able to look
 into this tomorrow, e.g. update to the latest default branch.

Great, thanks.

--nate

 
 Thanks,
 
 Peter




Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault

2012-11-27 Thread Peter Cock
On Tue, Nov 27, 2012 at 5:19 PM, Nate Coraor n...@bx.psu.edu wrote:

 So a little defensive coding could prevent the segfault then (leaving
 the separate issue of why the jobs lack this information)?

 Indeed, I pushed a check for this in 4a95ae9a26d9.

Great. That will help.

 This was a week ago, but yes, at the time it was reproducible
 with new jobs.

 Is that to say it's still happening and you've simply worked
 around it (by disabling tasks), or that it is no longer happening?

I've not tried it for a week - it was my development install that
tracks galaxy-central which showed the problem, so I avoided
updated our production install until resolving it.

I'll see how it behaves later this week (although there is a
mass job hogging the cluster queue which may complicate
matters and reduce the turn around).

Thanks,

Peter


Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault

2012-11-20 Thread Peter Cock

Is anyone else seeing this? I am wary of applying the update to our
production Galaxy until I know how to resolve this (other than just
disabling task splitting).
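
For reference, disabling task splitting is a single setting in universe_wsgi.ini (assuming the option name of that era):

use_tasked_jobs = False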

Thanks,

Peter


[galaxy-dev] DRMAA job will now be errored - Segmentation fault

2012-11-15 Thread Peter Cock
Hi all,

Something has changed in the job handling, and in a bad way. On my
development machine submitting jobs to the cluster didn't seem to be
working anymore (never sent to SGE). I killed Galaxy and restarted:

Starting server in PID 12180.
serving on http://127.0.0.1:8081
galaxy.jobs.runners.drmaa ERROR 2012-11-15 09:56:28,192 (320/None)
Unable to check job status
Traceback (most recent call last):
  File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
line 296, in check_watched_items
state = self.ds.jobStatus( job_id )
  File 
/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py,
line 522, in jobStatus
_h.c(_w.drmaa_job_ps, jobName, _ct.byref(status))
  File /mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py,
line 213, in c
return f(*(args + (error_buffer, sizeof(error_buffer
  File /mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py,
line 90, in error_check
raise _ERRORS[code-1](code %s: %s % (code, error_buffer.value))
InvalidArgumentException: code 4: Job id, None, is not a valid job id
galaxy.jobs.runners.drmaa WARNING 2012-11-15 09:56:28,193 (320/None)
job will now be errored
./run.sh: line 86: 12180 Segmentation fault  (core dumped) python
./scripts/paster.py serve universe_wsgi.ini $@

I restarted and it happened again, third time lucky. I presume this was
one segmentation fault for each orphaned/zombie job (since I'd tried
two cluster jobs which got stuck).

I was running with revision 340438c62171,
https://bitbucket.org/galaxy/galaxy-central/changeset/340438c62171578078323d39da398d5053b69d0a
as merged into my tools branch,
https://bitbucket.org/peterjc/galaxy-central/changeset/d49200df0707579f41fc4f25042354604ce20e63

Any thoughts?

Thanks,

Peter


Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault

2012-11-15 Thread Peter Cock
On Thu, Nov 15, 2012 at 10:06 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 Hi all,

 Something has changed in the job handling, and in a bad way. On my
 development machine submitting jobs to the cluster didn't seem to be
 working anymore (never sent to SGE). I killed Galaxy and restarted:
 ...
 (segmentation fault)

Looking into the problem with submitting the jobs, there seems to be
a problem with task splitting somehow recursing - the same file is
split four times, the filename getting longer and longer:

galaxy.jobs DEBUG 2012-11-15 10:08:33,510 (321) Working directory for
job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:33,510 dispatching job 321
to tasks runner
galaxy.jobs.handler INFO 2012-11-15 10:08:33,714 (321) Job dispatched
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:34,457 Split
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into
batches of 1000 records...
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:34,457 Attemping to
split FASTA file
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into
chunks of 1000 sequences
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:34,458 Writing
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat part to
/mnt/galaxy/galaxy-central/database/job_working_directory/000/321/task_0/dataset_344.dat
galaxy.jobs.splitters.multi DEBUG 2012-11-15 10:08:34,458 do_split
created 1 parts
galaxy.jobs DEBUG 2012-11-15 10:08:34,558 (321) Working directory for
job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:34,558 dispatching task
823, of job 321, to tasks runner
127.0.0.1 - - [15/Nov/2012:10:08:35 +0100] POST
/root/history_item_updates HTTP/1.1 200 -
http://127.0.0.1:8081/history; Mozilla/5.0 (X11; Linux x86_64;
rv:10.0.8) Gecko/20121012 Firefox/10.0.8
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:35,458 Split
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into
batches of 1000 records...
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:35,459 Attemping to
split FASTA file
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into
chunks of 1000 sequences
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:35,459 Writing
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat part to
/mnt/galaxy/galaxy-central/database/job_working_directory/000/321/task_0/task_0/dataset_344.dat
galaxy.jobs.splitters.multi DEBUG 2012-11-15 10:08:35,459 do_split
created 1 parts
galaxy.jobs DEBUG 2012-11-15 10:08:35,541 (321) Working directory for
job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:35,542 dispatching task
824, of job 321, to tasks runner
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,171 Split
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into
batches of 1000 records...
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,171 Attemping to
split FASTA file
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into
chunks of 1000 sequences
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,171 Writing
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat part to
/mnt/galaxy/galaxy-central/database/job_working_directory/000/321/task_0/task_0/task_0/dataset_344.dat
galaxy.jobs.splitters.multi DEBUG 2012-11-15 10:08:36,172 do_split
created 1 parts
galaxy.jobs DEBUG 2012-11-15 10:08:36,232 (321) Working directory for
job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:36,232 dispatching task
825, of job 321, to tasks runner
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,843 Split
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into
batches of 1000 records...
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,843 Attemping to
split FASTA file
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into
chunks of 1000 sequences
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,843 Writing
/mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat part to
/mnt/galaxy/galaxy-central/database/job_working_directory/000/321/task_0/task_0/task_0/task_0/dataset_344.dat
galaxy.jobs.splitters.multi DEBUG 2012-11-15 10:08:36,844 do_split
created 1 parts
galaxy.jobs DEBUG 2012-11-15 10:08:36,906 (321) Working directory for
job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:36,906 dispatching task
826, of job 321, to tasks runner

Hmm.

Peter


Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault

2012-11-15 Thread Peter Cock
On Thu, Nov 15, 2012 at 10:12 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Thu, Nov 15, 2012 at 10:06 AM, Peter Cock p.j.a.c...@googlemail.com 
 wrote:
 Hi all,

 Something has changed in the job handling, and in a bad way. On my
 development machine submitting jobs to the cluster didn't seem to be
 working anymore (never sent to SGE). I killed Galaxy and restarted:
 ...
 (segmentation fault)

 Looking into the problem with submitting the jobs, there seems to be
 a problem with task splitting somehow recursing - the same file is
 split four times, the filename getting longer and longer:

Turning off task splitting I could run the same job OK on SGE.

So, the good news is the problems seem to be specific to the
task splitting code. Also I have reproduced the segmentation
fault when restarting Galaxy (after stopping Galaxy with one
of these broken jobs).

Starting server in PID 17996.
serving on http://127.0.0.1:8081
galaxy.jobs.runners.drmaa ERROR 2012-11-15 11:07:27,762 (327/None)
Unable to check job status
Traceback (most recent call last):
  File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
line 296, in check_watched_items
state = self.ds.jobStatus( job_id )
  File 
/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py,
line 522, in jobStatus
_h.c(_w.drmaa_job_ps, jobName, _ct.byref(status))
  File /mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py,
line 213, in c
return f(*(args + (error_buffer, sizeof(error_buffer
  File /mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py,
line 90, in error_check
raise _ERRORS[code-1](code %s: %s % (code, error_buffer.value))
InvalidArgumentException: code 4: Job id, None, is not a valid job id
galaxy.jobs.runners.drmaa WARNING 2012-11-15 11:07:27,764 (327/None)
job will now be errored
./run.sh: line 86: 17996 Segmentation fault  (core dumped) python
./scripts/paster.py serve universe_wsgi.ini $@

The problem is that the job_id variable is "None" (note this is a string,
not the Python special object None) in check_watched_items().

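For reference, a minimal self-contained sketch (not the actual Galaxy fix) of
the kind of guard check_watched_items() would need before calling DRMAA; the
DrmaaSessionStub class and the job dictionaries below are made-up stand-ins
for self.ds and the watched-job list:

class DrmaaSessionStub(object):
    """Pretend DRMAA session that rejects obviously bad job ids."""
    def jobStatus(self, job_id):
        if not job_id or job_id == "None":
            raise ValueError("code 4: Job id, %s, is not a valid job id" % job_id)
        return "running"

def check_watched_items(watched, session):
    still_watched = []
    for job in watched:
        job_id = job.get("job_id")
        # Guard: after a restart the missing external id comes back from the
        # database as the *string* "None", so test for that form as well.
        if job_id in (None, "", "None"):
            print("(%s/%s) has no valid external id, marking job as errored"
                  % (job.get("galaxy_id"), job_id))
            continue
        print("(%s/%s) state: %s" % (job.get("galaxy_id"), job_id,
                                     session.jobStatus(job_id)))
        still_watched.append(job)
    return still_watched

check_watched_items([{"galaxy_id": 327, "job_id": "None"},   # broken job
                     {"galaxy_id": 328, "job_id": "4242"}],  # healthy job
                    DrmaaSessionStub())
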
Peter


Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-19 Thread Peter Cock
On Tue, Sep 18, 2012 at 7:11 PM, Scott McManus scottmcma...@gatech.edu wrote:
 Sorry - that's changeset 7714:3f12146d6d81

 -Scott

Hi Scott,

The good news is this error does seem to be fixed as of that commit:

TypeError: check_tool_output() takes exactly 5 arguments (4 given)

The bad news is my cluster jobs still aren't working properly (using
a job splitter). The jobs seem to run, get submitted to the cluster,
and finish, and the data looks OK via the 'eye' view icon, but is
red in the history with:

0 bytes
An error occurred running this job: info unavailable

I will investigate - it is likely due to another change... perhaps in
the new stdout/stderr/return code support?

Peter


Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-19 Thread Jorrit Boekel

Odd, it works for me on EC2/Cloudman.

jorrit

On 09/19/2012 03:29 PM, Peter Cock wrote:

On Tue, Sep 18, 2012 at 7:11 PM, Scott McManus scottmcma...@gatech.edu wrote:

Sorry - that's changeset 7714:3f12146d6d81

-Scott

Hi Scott,

The good news is this error does seem to be fixed as of that commit:

TypeError: check_tool_output() takes exactly 5 arguments (4 given)

The bad news is my cluster jobs still aren't working properly (using
a job splitter). The jobs seem to run, get submitted to the cluster,
and finish, and the data looks OK via the 'eye' view icon, but is
red in the history with:

0 bytes
An error occurred running this job: info unavailable

I will investigate - it is likely due to another change... perhaps in
the new stdout/stderr/return code support?

Peter


[galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-18 Thread Peter Cock
Hi all (and in particular, Scott),

I've just updated my development server and found the following
error when running jobs on our SGE cluster via DRMAA:

galaxy.jobs.runners.drmaa ERROR 2012-09-18 09:43:20,698 Job wrapper
finish method failed
Traceback (most recent call last):
  File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
line 371, in finish_job
drm_job_state.job_wrapper.finish( stdout, stderr, exit_code )
  File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/__init__.py, line
1048, in finish
if ( self.check_tool_output( stdout, stderr, tool_exit_code ) ):
TypeError: check_tool_output() takes exactly 5 arguments (4 given)

This looks to have been introduced in this commit:
https://bitbucket.org/galaxy/galaxy-central/changeset/f557b7b05fdd701cbf99ee04f311bcadb1ae29c4#chg-lib/galaxy/jobs/__init__.py

There should be an additional jobs argument, proposed fix:

$ hg diff lib/galaxy/jobs/__init__.py
diff -r 4007494e37e1 lib/galaxy/jobs/__init__.py
--- a/lib/galaxy/jobs/__init__.py   Tue Sep 18 09:40:19 2012 +0100
+++ b/lib/galaxy/jobs/__init__.py   Tue Sep 18 10:06:44 2012 +0100
@@ -1045,7 +1045,8 @@
 # Check what the tool returned. If the stdout or stderr matched
 # regular expressions that indicate errors, then set an error.
 # The same goes if the tool's exit code was in a given range.
-if ( self.check_tool_output( stdout, stderr, tool_exit_code ) ):
+job = self.get_job()
+if ( self.check_tool_output( stdout, stderr, tool_exit_code, job ) ):
 task.state = task.states.OK
 else:
 task.state = task.states.ERROR


(Let me know if you want this as a pull request - it seems a lot of
effort for a tiny change.)

Regards,

Peter


Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-18 Thread Scott McManus

I'll check it out. Thanks.

- Original Message -
 Hi all (and in particular, Scott),
 
 I've just updated my development server and found the following
 error when running jobs on our SGE cluster via DRMAA:
 
 galaxy.jobs.runners.drmaa ERROR 2012-09-18 09:43:20,698 Job wrapper
 finish method failed
 Traceback (most recent call last):
   File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
 line 371, in finish_job
 drm_job_state.job_wrapper.finish( stdout, stderr, exit_code )
   File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/__init__.py, line
 1048, in finish
 if ( self.check_tool_output( stdout, stderr, tool_exit_code ) ):
 TypeError: check_tool_output() takes exactly 5 arguments (4 given)
 
 This looks to have been introduced in this commit:
 https://bitbucket.org/galaxy/galaxy-central/changeset/f557b7b05fdd701cbf99ee04f311bcadb1ae29c4#chg-lib/galaxy/jobs/__init__.py
 
 There should be an additional jobs argument, proposed fix:
 
 $ hg diff lib/galaxy/jobs/__init__.py
 diff -r 4007494e37e1 lib/galaxy/jobs/__init__.py
 --- a/lib/galaxy/jobs/__init__.py Tue Sep 18 09:40:19 2012 +0100
 +++ b/lib/galaxy/jobs/__init__.py Tue Sep 18 10:06:44 2012 +0100
 @@ -1045,7 +1045,8 @@
  # Check what the tool returned. If the stdout or stderr
  matched
  # regular expressions that indicate errors, then set an
  error.
  # The same goes if the tool's exit code was in a given
  range.
 -if ( self.check_tool_output( stdout, stderr, tool_exit_code
 ) ):
 +job = self.get_job()
 +if ( self.check_tool_output( stdout, stderr, tool_exit_code,
 job ) ):
  task.state = task.states.OK
  else:
  task.state = task.states.ERROR
 
 
 (Let me know if you want this as a pull request - it seems a lot of
 effort for a tiny change.)
 
 Regards,
 
 Peter
 


Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-18 Thread Scott McManus

I have to admit that I'm a little confused as to why you would
be getting this error at all - the job variable is introduced 
at line 298 in the same file, and it's used as the last variable
to check_tool_output in the changeset you pointed to. 
(Also, thanks for pointing to it - that made investigating easier.)

Is it possible that there was a merge problem when you pulled the
latest set of code? For my own sanity, would you mind downloading 
a fresh copy of galaxy-central or galaxy-dist into a separate 
directory and see if the problem is still there? (I fully admit 
that there could be a bug that I left in, but all job runners 
should have stumbled across the same problem - the finish method
should be called by all job runners.)

Thanks again!

-Scott

- Original Message -
 
 I'll check it out. Thanks.
 
 - Original Message -
  Hi all (and in particular, Scott),
  
  I've just updated my development server and found the following
  error when running jobs on our SGE cluster via DRMAA:
  
  galaxy.jobs.runners.drmaa ERROR 2012-09-18 09:43:20,698 Job wrapper
  finish method failed
  Traceback (most recent call last):
File
/mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
  line 371, in finish_job
  drm_job_state.job_wrapper.finish( stdout, stderr, exit_code )
File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/__init__.py,
line
  1048, in finish
  if ( self.check_tool_output( stdout, stderr, tool_exit_code )
  ):
  TypeError: check_tool_output() takes exactly 5 arguments (4 given)
  
  This looks to have been introduced in this commit:
  https://bitbucket.org/galaxy/galaxy-central/changeset/f557b7b05fdd701cbf99ee04f311bcadb1ae29c4#chg-lib/galaxy/jobs/__init__.py
  
  There should be an additional jobs argument, proposed fix:
  
  $ hg diff lib/galaxy/jobs/__init__.py
  diff -r 4007494e37e1 lib/galaxy/jobs/__init__.py
  --- a/lib/galaxy/jobs/__init__.py   Tue Sep 18 09:40:19 2012 +0100
  +++ b/lib/galaxy/jobs/__init__.py   Tue Sep 18 10:06:44 2012 +0100
  @@ -1045,7 +1045,8 @@
   # Check what the tool returned. If the stdout or stderr
   matched
   # regular expressions that indicate errors, then set an
   error.
   # The same goes if the tool's exit code was in a given
   range.
  -if ( self.check_tool_output( stdout, stderr,
  tool_exit_code
  ) ):
  +job = self.get_job()
  +if ( self.check_tool_output( stdout, stderr,
  tool_exit_code,
  job ) ):
   task.state = task.states.OK
   else:
   task.state = task.states.ERROR
  
  
  (Let me know if you want this as a pull request - it seems a lot of
  effort for a tiny change.)
  
  Regards,
  
  Peter
  


Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-18 Thread Jorrit Boekel
Is it possible that you are looking at different classes? TaskWrapper's 
finish method does not use the job variable in my recently merged code 
either (line ~1045), while JobWrapper's does around line 315.


cheers,
jorrit




On 09/18/2012 03:55 PM, Scott McManus wrote:

I have to admit that I'm a little confused as to why you would
be getting this error at all - the job variable is introduced
at line 298 in the same file, and it's used as the last variable
to check_tool_output in the changeset you pointed to.
(Also, thanks for pointing to it - that made investigating easier.)

Is it possible that there was a merge problem when you pulled the
latest set of code? For my own sanity, would you mind downloading
a fresh copy of galaxy-central or galaxy-dist into a separate
directory and see if the problem is still there? (I fully admit
that there could be a bug that I left in, but all job runners
should have stumbled across the same problem - the finish method
should be called by all job runners.)

Thanks again!

-Scott

- Original Message -

I'll check it out. Thanks.

- Original Message -

Hi all (and in particular, Scott),

I've just updated my development server and found the following
error when running jobs on our SGE cluster via DRMAA:

galaxy.jobs.runners.drmaa ERROR 2012-09-18 09:43:20,698 Job wrapper
finish method failed
Traceback (most recent call last):
   File
   /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
line 371, in finish_job
 drm_job_state.job_wrapper.finish( stdout, stderr, exit_code )
   File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/__init__.py,
   line
1048, in finish
 if ( self.check_tool_output( stdout, stderr, tool_exit_code )
 ):
TypeError: check_tool_output() takes exactly 5 arguments (4 given)

This looks to have been introduced in this commit:
https://bitbucket.org/galaxy/galaxy-central/changeset/f557b7b05fdd701cbf99ee04f311bcadb1ae29c4#chg-lib/galaxy/jobs/__init__.py

There should be an additional jobs argument, proposed fix:

$ hg diff lib/galaxy/jobs/__init__.py
diff -r 4007494e37e1 lib/galaxy/jobs/__init__.py
--- a/lib/galaxy/jobs/__init__.py   Tue Sep 18 09:40:19 2012 +0100
+++ b/lib/galaxy/jobs/__init__.py   Tue Sep 18 10:06:44 2012 +0100
@@ -1045,7 +1045,8 @@
  # Check what the tool returned. If the stdout or stderr
  matched
  # regular expressions that indicate errors, then set an
  error.
  # The same goes if the tool's exit code was in a given
  range.
-if ( self.check_tool_output( stdout, stderr,
tool_exit_code
) ):
+job = self.get_job()
+if ( self.check_tool_output( stdout, stderr,
tool_exit_code,
job ) ):
  task.state = task.states.OK
  else:
  task.state = task.states.ERROR


(Let me know if you want this as a pull request - it seems a lot of
effort for a tiny change.)

Regards,

Peter




Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-18 Thread Scott McManus

Thanks, Jorrit! That was a good catch. Yes, it's a problem with the TaskWrapper.
I'll see what I can do about it.

-Scott

- Original Message -
 Is it possible that you are looking at different classes?
 TaskWrapper's
 finish method does not use the job variable in my recently merged
 code
 either (line ~1045), while JobWrapper's does around line 315.
 
 cheers,
 jorrit
 
 
 
 
 On 09/18/2012 03:55 PM, Scott McManus wrote:
  I have to admit that I'm a little confused as to why you would
  be getting this error at all - the job variable is introduced
  at line 298 in the same file, and it's used as the last variable
  to check_tool_output in the changeset you pointed to.
  (Also, thanks for pointing to it - that made investigating easier.)
 
  Is it possible that there was a merge problem when you pulled the
  latest set of code? For my own sanity, would you mind downloading
  a fresh copy of galaxy-central or galaxy-dist into a separate
  directory and see if the problem is still there? (I fully admit
  that there could be a bug that I left in, but all job runners
  should have stumbled across the same problem - the finish method
  should be called by all job runners.)
 
  Thanks again!
 
  -Scott
 
  - Original Message -
  I'll check it out. Thanks.
 
  - Original Message -
  Hi all (and in particular, Scott),
 
  I've just updated my development server and found the following
  error when running jobs on our SGE cluster via DRMAA:
 
  galaxy.jobs.runners.drmaa ERROR 2012-09-18 09:43:20,698 Job
  wrapper
  finish method failed
  Traceback (most recent call last):
 File
 /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
  line 371, in finish_job
   drm_job_state.job_wrapper.finish( stdout, stderr, exit_code
   )
 File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/__init__.py,
 line
  1048, in finish
   if ( self.check_tool_output( stdout, stderr, tool_exit_code
   )
   ):
  TypeError: check_tool_output() takes exactly 5 arguments (4
  given)
 
  This looks to have been introduced in this commit:
  https://bitbucket.org/galaxy/galaxy-central/changeset/f557b7b05fdd701cbf99ee04f311bcadb1ae29c4#chg-lib/galaxy/jobs/__init__.py
 
  There should be an additional jobs argument, proposed fix:
 
  $ hg diff lib/galaxy/jobs/__init__.py
  diff -r 4007494e37e1 lib/galaxy/jobs/__init__.py
  --- a/lib/galaxy/jobs/__init__.py Tue Sep 18 09:40:19 2012 +0100
  +++ b/lib/galaxy/jobs/__init__.py Tue Sep 18 10:06:44 2012 +0100
  @@ -1045,7 +1045,8 @@
# Check what the tool returned. If the stdout or stderr
matched
# regular expressions that indicate errors, then set an
error.
# The same goes if the tool's exit code was in a given
range.
  -if ( self.check_tool_output( stdout, stderr,
  tool_exit_code
  ) ):
  +job = self.get_job()
  +if ( self.check_tool_output( stdout, stderr,
  tool_exit_code,
  job ) ):
task.state = task.states.OK
else:
task.state = task.states.ERROR
 
 
  (Let me know if you want this as a pull request - it seems a lot
  of
  effort for a tiny change.)
 
  Regards,
 
  Peter
 


Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-18 Thread Peter Cock
On Tue, Sep 18, 2012 at 3:09 PM, Jorrit Boekel
jorrit.boe...@scilifelab.se wrote:
 Is it possible that you are looking at different classes? TaskWrapper's
 finish method does not use the job variable in my recently merged code
 either (line ~1045), while JobWrapper's does around line 315.

 cheers,
 jorrit

Yes exactly (as per my follow up email sent just before yours ;) )

Peter


Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-18 Thread Scott McManus

Ok - that change was made. The difference is that the change
is applied to the task instead of the job. It's in changeset
7713:bfd10aa67c78, and it ran successfully in my environments
on local, pbs, and drmaa runners. Let me know if there are 
any problems.

Thanks again for your patience.

-Scott

- Original Message -
 On Tue, Sep 18, 2012 at 3:09 PM, Jorrit Boekel
 jorrit.boe...@scilifelab.se wrote:
  Is it possible that you are looking at different classes?
  TaskWrapper's
  finish method does not use the job variable in my recently merged
  code
  either (line ~1045), while JobWrapper's does around line 315.
 
  cheers,
  jorrit
 
 Yes exactly (as per my follow up email sent just before yours ;) )
 
 Peter
 


Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)

2012-09-18 Thread Scott McManus
Sorry - that's changeset 7714:3f12146d6d81

-Scott

- Original Message -
 
 Ok - that change was made. The difference is that the change
 is applied to the task instead of the job. It's in changeset
 7713:bfd10aa67c78, and it ran successfully in my environments
 on local, pbs, and drmaa runners. Let me know if there are
 any problems.
 
 Thanks again for your patience.
 
 -Scott
 
 - Original Message -
  On Tue, Sep 18, 2012 at 3:09 PM, Jorrit Boekel
  jorrit.boe...@scilifelab.se wrote:
   Is it possible that you are looking at different classes?
   TaskWrapper's
   finish method does not use the job variable in my recently merged
   code
   either (line ~1045), while JobWrapper's does around line 315.
  
   cheers,
   jorrit
  
  Yes exactly (as per my follow up email sent just before yours ;) )
  
  Peter
  


Re: [galaxy-dev] drmaa module does not load

2012-03-27 Thread Louise-Amélie Schmitt

On 27/03/2012 11:03, Louise-Amélie Schmitt wrote:

On 26/03/2012 16:13, Nate Coraor wrote:

On Mar 26, 2012, at 5:11 AM, Louise-Amélie Schmitt wrote:


Hello everyone,

I wanted to start the drmaa job runner and followed the instructions 
in the wiki, but I have this error message when I start Galaxy:


galaxy.jobs ERROR 2012-03-23 15:28:49,845 Job runner is not 
loadable: galaxy.jobs.runners. drmaa

Traceback (most recent call last):
   File /g/funcgen/galaxy/lib/galaxy/jobs/__init__.py, line 1195, 
in _load_plugin

 module = __import__( module_name )
ImportError: No module named  drmaa

I checked /g/funcgen/galaxy/lib/galaxy/jobs/runners and it contains 
the drmaa.py file


There was no drmaa egg so I made a copy of it from our other Galaxy 
install but it didn't solve the problem.


I don't really know where to start looking, any idea?

Thanks,
L-A

Hi L-A,

There's an errant space in the runner name: ' drmaa'.  I am going to 
guess that your start_job_runners looks like:


 start_job_runners = pbs, drmaa

Only the whitespace at the beginning and end of that parameter is 
stripped.  I've committed a fix for this that'll be in the next 
galaxy-dist, but in the meantime, remove the space after the comma 
and the drmaa runner should load.
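
A minimal sketch of the parsing difference Nate describes (the helper name
is made up; this is not the actual Galaxy config code):

def parse_runners(value):
    # Strip whitespace around *each* runner name, not just around the whole
    # string, so "pbs, drmaa" yields "drmaa" rather than " drmaa".
    return [name.strip() for name in value.split(",") if name.strip()]

print(parse_runners("pbs, drmaa"))        # ['pbs', 'drmaa']
print("pbs, drmaa".strip().split(","))    # ['pbs', ' drmaa'] - the failing form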


--nate


Hi Nate,

After removing the space (and a facepalm) it now fetches the drmaa
egg and starts the module properly, thanks a lot!


I still have an issue though: When I use the run shell script Galaxy 
crashes right before loading the drmaa runner (right after the pbs 
runner is loaded). The weird thing is that when I launch the command 
manually it works fine:
python ./scripts/paster.py serve universe_wsgi.runner.ini 
--server-name=runner0 --pid-file=runner0.pid --log-file=runner0.log 
--daemon


The other weird thing is that I get no error message at all.

I'll try looking into it but if you have any idea about what's going 
wrong, it would help greatly :)


Thanks again,
L-A


Ok, it looks like I didn't notice the changes in the start scripts, and
since we use two custom copies of run.sh the code was outdated. I
corrected that and it now seems to work properly.


Best,
L-A


[galaxy-dev] drmaa module does not load

2012-03-26 Thread Louise-Amélie Schmitt

Hello everyone,

I wanted to start the drmaa job runner and followed the instructions in 
the wiki, but I have this error message when I start Galaxy:


galaxy.jobs ERROR 2012-03-23 15:28:49,845 Job runner is not loadable: 
galaxy.jobs.runners. drmaa

Traceback (most recent call last):
  File /g/funcgen/galaxy/lib/galaxy/jobs/__init__.py, line 1195, in 
_load_plugin

module = __import__( module_name )
ImportError: No module named  drmaa

I checked /g/funcgen/galaxy/lib/galaxy/jobs/runners and it contains the 
drmaa.py file


There was no drmaa egg so I made a copy of it from our other Galaxy 
install but it didn't solve the problem.


I don't really know where to start looking, any idea?

Thanks,
L-A

Re: [galaxy-dev] DRMAA error with latest update 26920e20157f

2012-01-30 Thread Shantanu Pavgi

Figured it out. It was an error introduced while resolving version control 
conflicts.  
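
For anyone searching for the same traceback: a minimal illustration (not the
actual drmaa.py code) of how a mis-merged '%' formatting call raises exactly
this TypeError - the format string below has one placeholder but two values
are supplied:

runner_url, command_line = "drmaa://-q galaxy/", "/bin/hostname"
try:
    message = "(%s) queuing job" % (runner_url, command_line)
except TypeError as exc:
    print(exc)   # not all arguments converted during string formatting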
 
--
Shantanu

On Jan 29, 2012, at 9:12 PM, Shantanu Pavgi wrote:

 
 I am getting the following error with the latest galaxy-dist revision 
 '26920e20157f' update.  The Python version is 2.6.6. 
 
 {{{
 galaxy.jobs.runners.drmaa ERROR 2012-01-29 21:00:28,577 Uncaught exception 
 queueing job
 Traceback (most recent call last):
  File /projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py, line 
 140, in run_next
self.queue_job( obj )
  File /projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py, line 
 190, in queue_job
command_line )
 TypeError: not all arguments converted during string formatting
 }}}
 
 I was wondering if anyone else is experiencing this same issue. The system 
 works fine when I roll back to revision 'b258de1e6cea'.  Are there any 
 additional configuration details required with the latest revision that I am 
 missing? 
 
 --
 Shantanu


[galaxy-dev] DRMAA error with latest update 26920e20157f

2012-01-29 Thread Shantanu Pavgi

I am getting the following error with the latest galaxy-dist revision 
'26920e20157f' update.  The Python version is 2.6.6. 

{{{
galaxy.jobs.runners.drmaa ERROR 2012-01-29 21:00:28,577 Uncaught exception 
queueing job
Traceback (most recent call last):
  File /projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py, line 
140, in run_next
self.queue_job( obj )
  File /projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py, line 
190, in queue_job
command_line )
TypeError: not all arguments converted during string formatting
}}}

I was wondering if anyone else is experiencing this same issue. The system 
works fine when I roll back to revision 'b258de1e6cea'.  Are there any 
additional configuration details required with the latest revision that I am 
missing? 

--
Shantanu


Re: [galaxy-dev] DRMAA broken following SGE update. How to fix?

2011-08-23 Thread Chris Cole

BUMP

Does anyone have any idea on this? Our Galaxy is currently out of action 
until this is sorted.

Thanks,

Chris

On 22/08/11 10:08, Chris Cole wrote:

Hi,

Following a recent update to our SGE, DRMAA is failing to load in
galaxy. The reason being that the path has changed. How do I change the
path for galaxy to find the libdrmaa module?
Cheers,

Chris






Re: [galaxy-dev] DRMAA broken following SGE update. How to fix?

2011-08-23 Thread remy d1
Hi Chris,

Take a look at this file :
lib/galaxy/jobs/runners/drmaa.py
in your galaxy directory.

You can export a path for binaries here. For example:
export PATH=$PATH:/opt/bin/
before the export PYTHONPATH line.


Moreover, do not forget other path values in your service script or in your
galaxy user profile.

Hope this helps.
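
Another thing worth checking, since the problem here is a relocated libdrmaa:
a minimal sketch of pointing the Python drmaa egg at the new location,
assuming the egg honours the DRMAA_LIBRARY_PATH environment variable (the SGE
path below is only an example and will differ on your system):

import os

# Must be set before the drmaa module is imported, because the shared
# library is loaded at import time.
os.environ.setdefault("DRMAA_LIBRARY_PATH",
                      "/opt/sge/lib/lx24-amd64/libdrmaa.so.1.0")

import drmaa

session = drmaa.Session()
session.initialize()
print("drmaa session initialised OK")
session.exit()

For Galaxy itself the same variable would presumably need to be exported in
whatever environment starts run.sh.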



2011/8/23 Roman Valls brainst...@nopcode.org

 Did you adjust the SGE_ROOT environment variable to point to the
 libdrmaa for SGE (probably /opt/sge_62u5_gr/bin/lx24-amd64)?

 This is of course just a guess; could you please provide some error
 messages/logs?

 Cheers,
 Roman

 On 2011-08-23 10:20, Chris Cole wrote:
  BUMP
 
  Does anyone have any idea on this? Our Galaxy is currently out of action
  until this is sorted.
  Thanks,
 
  Chris
 
  On 22/08/11 10:08, Chris Cole wrote:
  Hi,
 
  Following a recent update to our SGE, DRMAA is failing to load in
  galaxy. The reason being that the path has changed. How do I change the
  path for galaxy to find the libdrmaa module?
  Cheers,
 
  Chris
 
 
 

[galaxy-dev] DRMAA broken following SGE update. How to fix?

2011-08-22 Thread Chris Cole

Hi,

Following a recent update to our SGE, DRMAA is failing to load in 
galaxy. The reason being that the path has changed. How do I change the 
path for galaxy to find the libdrmaa module?

Cheers,

Chris


--
Dr Chris Cole
Senior Research Associate (Bioinformatics)
College of Life Sciences
University of Dundee
Dow Street
Dundee
DD1 5EH
Scotland, UK

url: http://network.nature.com/profile/drchriscole
e-mail: ch...@compbio.dundee.ac.uk
Tel: +44 (0)1382 388 721

The University of Dundee is a registered Scottish charity, No: SC015096


Re: [galaxy-dev] DRMAA options for SGE

2011-07-28 Thread Ka Ming Nip
Hi Ambarish,

Using what I had in my previous message:
mytoolname = drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/
my jobs do get submitted to the cluster and I can see the other options in 
qstat for the jobs, 
i.e. hard resource_list: mem_free=1G,mem_token=1G,h_vmem=1G

Ka Ming

From: ambarish biswas [ambarishbis...@gmail.com]
Sent: July 27, 2011 4:18 PM
To: Ka Ming Nip
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] DRMAA options for SGE

Hi Ming,
   Just an idea: -w n might be hiding the error reporting. Are your jobs 
getting submitted and executed correctly?

For the queuing, you can add galaxy at the end, which makes it:

drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/galaxy

here galaxy is the queue name.



With Regards,

Ambarish Biswas,
University of Otago
Department of Biochemistry,
Dunedin, New Zealand,
Tel: +64(22)0855647
Fax: +64(0)3 479 7866




On Thu, Jul 28, 2011 at 6:21 AM, Ka Ming Nip km...@bcgsc.ca wrote:
Answering myself here...

I don't see the error anymore after adding -w n:

[galaxy:tool_runners]
...
mytoolname = drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/
...

Is this what other Galaxy admins do when using drmaa native options for their 
SGE?

Ka Ming

From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu]
 On Behalf Of Ka Ming Nip [km...@bcgsc.ca]
Sent: July 26, 2011 10:39 AM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] DRMAA options for SGE

Hi,

I am trying to configure the proper memory resource requests for my Galaxy tool.
This is what I have under the tool_runners section of universe_wsgi.ini

[galaxy:tool_runners]
...
mytoolname = drmaa://-l mem_free=1G -l mem_token=1G -l h_vmem=1G/
...

When I execute my tool on Galaxy, I get the error below in the shell that I ran 
sh run.sh:

galaxy.jobs.runners.drmaa ERROR 2011-07-26 09:54:01,930 Uncaught exception 
queueing job
Traceback (most recent call last):
 File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, 
line 112, in run_next
   self.queue_job( obj )
 File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, 
line 177, in queue_job
   job_id = self.ds.runJob(jt)
 File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/__init__.py,
 line 331, in runJob
 File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/helpers.py,
 line 213, in c
 File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/errors.py, 
line 90, in error_check
DeniedByDrmException: code 17: error: no suitable queues

All the flags I used work with qsub commands on the SGE cluster I use. The tool 
runs
when I comment out the line in universe_wsgi.ini.

Thanks,
Ka Ming


Re: [galaxy-dev] DRMAA options for SGE

2011-07-28 Thread ambarish biswas
Hi,
this is what is working for me; the 'Output not returned from cluster'
error has also disappeared.

default_cluster_job_runner = drmaa://-q galaxy -V/

if this works, then add your other options back and test them.

With Regards,

Ambarish Biswas,
University of Otago
Department of Biochemistry,
Dunedin, New Zealand,
Tel: +64(22)0855647
Fax: +64(0)3 479 7866




On Fri, Jul 29, 2011 at 6:32 AM, Ka Ming Nip km...@bcgsc.ca wrote:

 Hi Ambarish,

 Using what I had in my previous message:
 mytoolname = drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/
  my jobs do get submitted to the cluster and I can see the other options in
  qstat for the jobs, i.e. hard resource_list: mem_free=1G,mem_token=1G,h_vmem=1G

 Ka Ming
 
 From: ambarish biswas [ambarishbis...@gmail.com]
 Sent: July 27, 2011 4:18 PM
 To: Ka Ming Nip
 Cc: galaxy-dev@lists.bx.psu.edu
 Subject: Re: [galaxy-dev] DRMAA options for SGE

 Hi Ming,
   Just an idea: -w n might be hiding the error reporting. Are your jobs
 getting submitted and executed correctly?

 For the queuing, you can add galaxy at the end, which makes it:

 drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/galaxy

 here galaxy is the queue name.



 With Regards,
 
 Ambarish Biswas,
 University of Otago
 Department of Biochemistry,
 Dunedin, New Zealand,
 Tel: +64(22)0855647
 Fax: +64(0)3 479 7866




 On Thu, Jul 28, 2011 at 6:21 AM, Ka Ming Nip km...@bcgsc.ca wrote:
 Answering myself here...

 I don't see the error anymore after adding -w n:

 [galaxy:tool_runners]
 ...
 mytoolname = drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/
 ...

 Is this what other Galaxy admins do when using drmaa native options for
 their SGE?

 Ka Ming
 
 From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu]
 On Behalf Of Ka Ming Nip [km...@bcgsc.ca]
 Sent: July 26, 2011 10:39 AM
 To: galaxy-dev@lists.bx.psu.edu
 Subject: [galaxy-dev] DRMAA options for SGE

 Hi,

 I am trying to configure the proper memory resource requests for my Galaxy
 tool.
 This is what I have under the tool_runners section of universe_wsgi.ini

 [galaxy:tool_runners]
 ...
 mytoolname = drmaa://-l mem_free=1G -l mem_token=1G -l h_vmem=1G/
 ...

 When I execute my tool on Galaxy, I get the error below in the shell that I
 ran sh run.sh:

 galaxy.jobs.runners.drmaa ERROR 2011-07-26 09:54:01,930 Uncaught exception
 queueing job
 Traceback (most recent call last):
  File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
 line 112, in run_next
   self.queue_job( obj )
  File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py,
 line 177, in queue_job
   job_id = self.ds.runJob(jt)
  File
 /home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/__init__.py,
 line 331, in runJob
  File
 /home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/helpers.py,
 line 213, in c
  File
 /home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/errors.py,
 line 90, in error_check
 DeniedByDrmException: code 17: error: no suitable queues

 All the flags I used work with qsub commands on the SGE cluster I use. The
 tool runs
 when I comment out the line in universe_wsgi.ini.

 Thanks,
 Ka Ming

Re: [galaxy-dev] DRMAA options for SGE

2011-07-27 Thread Ka Ming Nip
Answering myself here...

I don't see the error anymore after adding -w n:

[galaxy:tool_runners]
...
mytoolname = drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/
...

Is this what other Galaxy admins do when using drmaa native options for their 
SGE?

Ka Ming

From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu] 
On Behalf Of Ka Ming Nip [km...@bcgsc.ca]
Sent: July 26, 2011 10:39 AM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] DRMAA options for SGE

Hi,

I am trying to configure the proper memory resource requests for my Galaxy tool.
This is what I have under the tool_runners section of universe_wsgi.ini

[galaxy:tool_runners]
...
mytoolname = drmaa://-l mem_free=1G -l mem_token=1G -l h_vmem=1G/
...

When I execute my tool on Galaxy, I get the error below in the shell that I ran 
sh run.sh:

galaxy.jobs.runners.drmaa ERROR 2011-07-26 09:54:01,930 Uncaught exception 
queueing job
Traceback (most recent call last):
  File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, 
line 112, in run_next
self.queue_job( obj )
  File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, 
line 177, in queue_job
job_id = self.ds.runJob(jt)
  File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/__init__.py,
 line 331, in runJob
  File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/helpers.py,
 line 213, in c
  File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/errors.py, 
line 90, in error_check
DeniedByDrmException: code 17: error: no suitable queues

All the flags I used work with qsub commands on the SGE cluster I use. The tool 
runs
when I comment out the line in universe_wsgi.ini.

Thanks,
Ka Ming


[galaxy-dev] DRMAA options for SGE

2011-07-26 Thread Ka Ming Nip
Hi,

I am trying to configure the proper memory resource requests for my Galaxy tool.
This is what I have under the tool_runners section of universe_wsgi.ini

[galaxy:tool_runners]
...
mytoolname = drmaa://-l mem_free=1G -l mem_token=1G -l h_vmem=1G/
...

When I execute my tool on Galaxy, I get the error below in the shell that I ran 
sh run.sh:

galaxy.jobs.runners.drmaa ERROR 2011-07-26 09:54:01,930 Uncaught exception 
queueing job
Traceback (most recent call last):
  File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, 
line 112, in run_next
self.queue_job( obj )
  File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, 
line 177, in queue_job
job_id = self.ds.runJob(jt)
  File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/__init__.py,
 line 331, in runJob
  File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/helpers.py,
 line 213, in c
  File 
/home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/errors.py, 
line 90, in error_check
DeniedByDrmException: code 17: error: no suitable queues

All the flags I used work with qsub commands on the SGE cluster I use. The tool 
runs
when I comment out the line in universe_wsgi.ini.

Thanks,
Ka Ming


[galaxy-dev] drmaa://native/ : native options are ignored

2011-06-20 Thread Geert Vandeweyer

Hi,

I'm working on a local installation of galaxy using torque with drmaa 
(the pbs-torque scramble failed). The torque-drmaa works fine so far, 
except for one issue.


I'd like to specify some tool-dependent requirements from the 
tool_runners section in universe_wsgi.ini. For now I've been testing it 
with the setting below to have global native arguments:


default_cluster_job_runner =  drmaa://-l mem=4gb:nodes=1:ppn=6/

This should request 4gb of memory on a single node with 6 threads, but 
these requests are ignored. They are not listed by 'qstat -R', and more 
simultaneous jobs are started than would be possible if the requirements 
were taken into account. What am I missing here?


Best regards,

Geert Vandeweyer



Re: [galaxy-dev] drmaa://native/ : native options are ignored

2011-06-20 Thread Marina Gourtovaia

Hi
default_cluster_job_runner = drmaa://-q srpipeline -P pipeline/
works for me on LSF, so your syntax seems to be correct.

Assuming that -l mem=4gb:nodes=1:ppn=6 works the way you expect when you 
start the jobs on your cluster from the shell, read on...


Bearing in mind that the value of the option contains ':', I'd check 
that it is read correctly by the parser that parses universe_wsgi.ini 
and that it is passed correctly to drmaa. If it's passed to drmaa correctly 
but does not produce the desired effect, I'd look closely at the drmaa 
library. This might be a bug. It's possible to use the drmaa library 
from a C script - this way you can test whether drmaa works the way you want. 
Also, it was rather straightforward to generate a Perl SWIG wrapper for 
drmaa and write Perl test scripts. SWIG wrappers can be generated for 
any scripting language and also Java.
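
Since Galaxy already ships a Python drmaa egg, a quicker variant of the same
test can be done from Python; a minimal sketch (the resource string is
Geert's example and /bin/hostname is just a placeholder command), which
submits one job and then lets you compare what qstat -R reports:

import drmaa

session = drmaa.Session()
session.initialize()
try:
    jt = session.createJobTemplate()
    jt.remoteCommand = "/bin/hostname"
    jt.nativeSpecification = "-l mem=4gb:nodes=1:ppn=6"
    job_id = session.runJob(jt)
    print("submitted job %s - now inspect it with 'qstat -R'" % job_id)
    session.deleteJobTemplate(jt)
finally:
    session.exit()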


Marina

On 20/06/2011 10:53, Geert Vandeweyer wrote:

Hi,

I'm working on a local installation of galaxy using torque with drmaa 
(the pbs-torque scramble failed). The torque-drmaa works fine so far, 
except for one issue.


I'd like to specify some tool-dependent requirements from the 
tool_runners section in universe_wsgi.ini. For now I've been testing 
it with the setting below to have global native arguments:


default_cluster_job_runner =  drmaa://-l mem=4gb:nodes=1:ppn=6/

This should request 4gb of memory on a single node with 6 threads, but 
these requests are ignored. They are not listed by 'qstat -R', and more 
simultaneous jobs are started than would be possible if the requirements 
were taken into account. What am I missing here?


Best regards,

Geert Vandeweyer




--
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 