[galaxy-dev] DRMAA Slurm error
Hi, I have configured a new Galaxy Project site with SLURM (version 14). I have one server running a Galaxy instance, one node running the SLURM server, and two SLURM worker nodes. I have compiled SLURM-DRMAA from source. When I run "drmaa-run /bin/hostname" it works, but when I try to start the Galaxy server I get the following error:

Traceback (most recent call last):
  File "/home/galaxy-dist/lib/galaxy/webapps/galaxy/buildapp.py", line 39, in app_factory
    app = UniverseApplication( global_conf = global_conf, **kwargs )
  File "/home/galaxy-dist/lib/galaxy/app.py", line 141, in __init__
    self.job_manager = manager.JobManager( self )
  File "/home/galaxy-dist/lib/galaxy/jobs/manager.py", line 23, in __init__
    self.job_handler = handler.JobHandler( app )
  File "/home/galaxy-dist/lib/galaxy/jobs/handler.py", line 32, in __init__
    self.dispatcher = DefaultJobDispatcher( app )
  File "/home/galaxy-dist/lib/galaxy/jobs/handler.py", line 704, in __init__
    self.job_runners = self.app.job_config.get_job_runner_plugins( self.app.config.server_name )
  File "/home/galaxy-dist/lib/galaxy/jobs/__init__.py", line 621, in get_job_runner_plugins
    rval[id] = runner_class( self.app, runner[ 'workers' ], **runner.get( 'kwds', {} ) )
  File "/home/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 81, in __init__
    self.ds.initialize()
  File "/home/galaxy-dist/eggs/drmaa-0.7.6-py2.6.egg/drmaa/session.py", line 257, in initialize
    py_drmaa_init(contactString)
  File "/home/galaxy-dist/eggs/drmaa-0.7.6-py2.6.egg/drmaa/wrappers.py", line 73, in py_drmaa_init
    return _lib.drmaa_init(contact, error_buffer, sizeof(error_buffer))
  File "/home/galaxy-dist/eggs/drmaa-0.7.6-py2.6.egg/drmaa/errors.py", line 151, in error_check
    raise _ERRORS[code - 1](error_string)
AlreadyActiveSessionException: code 11: DRMAA session already exist.
[root@galaxy-project galaxy-dist]#

This is my job_conf.xml:

    <job_conf>
        <plugins workers="4">
            <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
            <plugin id="drmaa" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner"/>
            <plugin id="cli" type="runner" load="galaxy.jobs.runners.cli:ShellJobRunner"/>
            <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
                <param id="drmaa_library_path">/usr/local/lib/libdrmaa.so</param>
            </plugin>
        </plugins>
        <handlers>
            <handler id="main"/>
        </handlers>
        <destinations default="drmaa_slurm">
            <destination id="local" runner="local"/>
            <destination id="multicore_local" runner="local">
                <param id="local_slots">4</param>
                <param id="embed_metadata_in_job">True</param>
                <job_metrics/>
            </destination>
            <destination id="docker_local" runner="local">
                <param id="docker_enabled">true</param>
            </destination>
            <destination id="drmaa_slurm" runner="drmaa">
                <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
                <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
                <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
            </destination>
            <destination id="direct_slurm" runner="slurm">
                <param id="nativeSpecification">--time=00:01:00</param>
            </destination>
        </destinations>
        <resources default="default">
            <group id="default"/>
            <group id="memoryonly">memory</group>
            <group id="all">processors,memory,time,project</group>
        </resources>
        <tools>
            <tool id="foo" handler="trackster_handler">
                <param id="source">trackster</param>
            </tool>
            <tool id="bar" destination="dynamic"/>
            <tool id="longbar" destination="dynamic" resources="all"/>
            <tool id="baz" handler="special_handlers" destination="bigmem"/>
        </tools>
        <limits>
            <limit type="registered_user_concurrent_jobs">2</limit>
            <limit type="anonymous_user_concurrent_jobs">1</limit>
            <limit type="destination_user_concurrent_jobs" id="local">1</limit>
            <limit type="destination_user_concurrent_jobs" tag="mycluster">2</limit>
            <limit type="destination_user_concurrent_jobs" tag="longjobs">1</limit>
            <limit type="destination_total_concurrent_jobs" id="local">16</limit>
            <limit type="destination_total_concurrent_jobs" tag="longjobs">100</limit>
            <limit type="walltime">24:00:00</limit>
            <limit type="output_size">10GB</limit>
        </limits>
    </job_conf>

Can you help me? I am a newbie at Galaxy Project administration. THANKS IN ADVANCE

Alfonso Pardo Diaz
System Administrator / Researcher
c/ Sola nº 1; 10200 Trujillo, ESPAÑA
Tel: +34 927 65 93 17  Fax: +34 927 32 32 37
http://www.ceta-ciemat.es/
Re: [galaxy-dev] DRMAA Slurm error
Solved! The problem was that my job_conf.xml configured two plugin entries backed by DRMAA. I deleted this entry:

    <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">

and now it works! Thanks

On 25/09/2014, at 08:12, Pardo Diaz, Alfonso <alfonso.pa...@ciemat.es> wrote:
[quoted original message omitted; see the previous post]
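For readers hitting the same error: DRMAA version 1 permits only one active session per process, and both the drmaa plugin and the slurm plugin (which is built on the DRMAA runner) call drmaa_init when they load, so whichever loads second fails. The collision can be reproduced outside Galaxy with the same python-drmaa bindings named in the traceback (a minimal sketch, assuming a working libdrmaa and DRMAA_LIBRARY_PATH):

    import drmaa

    first = drmaa.Session()
    first.initialize()   # opens the single DRMAA session this process is allowed

    second = drmaa.Session()
    second.initialize()  # raises AlreadyActiveSessionException, matching the log above

Keeping only one DRMAA-backed plugin per handler, as described above, leaves a single drmaa_init call and avoids the exception.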
[galaxy-dev] drmaa
Hi all, I've configured Galaxy with the drmaa python module. This is really more of a drmaa question... We have no default queue set in Moab and I can't seem to find a way to specify a queue in the docs I've been looking at here - http://drmaa-python.readthedocs.org/en/latest/tutorials.html

I'd like to be able to specify a queue based on various pieces of logic in my destinations.py script. Any suggestions would be appreciated.

Donny
FSU Research Computing Center
Re: [galaxy-dev] drmaa
Hi Donny, You should be able to specify the queue using the nativeSpecification field of drmaa requests, e.g. in your job_conf.xml:

    <destination id="batch" runner="pbs_drmaa">
        <param id="nativeSpecification">-q batch</param>
    </destination>

Documentation on job_conf.xml's syntax by runner can be found here: https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster

--nate

On Tue, May 6, 2014 at 5:53 PM, Shrum, Donald C <dcsh...@admin.fsu.edu> wrote:
[quoted original message omitted; see the previous post]
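For the dynamic half of Donny's question (choosing the queue from logic in destinations.py), the rule function can construct the destination programmatically rather than referencing a static id. A sketch under the assumption of Galaxy's dynamic job destination interface; the function name, queue names, and routing test below are illustrative, not from this thread:

    from galaxy.jobs import JobDestination

    def route_by_queue(user_email):
        # Hypothetical policy: staff jobs go to a longer queue, others to batch.
        queue = "long" if user_email.endswith("@rcc.fsu.edu") else "batch"
        return JobDestination(
            runner="pbs_drmaa",
            params={"nativeSpecification": "-q %s" % queue},
        )

A tool mapped to a dynamic destination that names this rule would then get its queue assigned per job.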
Re: [galaxy-dev] DRMAA configuring issue
I feel like someone should respond to this, but I must admit I don't have a lot of ideas. I assume you are able to use qsub to submit jobs from the Galaxy server? It is worth verifying that before anything else. If that doesn't work, the system configuration needs to be modified. I think there are a couple of different implementations of DRMAA for PBS:

http://apps.man.poznan.pl/trac/pbs-drmaa (I think this is the recommended one)
http://sourceforge.net/projects/pbspro-drmaa/

It might be worth trying to compile the latest and greatest of one or both and trying each. Galaxy also has a PBS runner that many people use for communicating with Torque. I think the DRMAA runner should work - but the PBS runner is a fallback option as well, just to get going.

-John
Re: [galaxy-dev] DRMAA configuring issue
Does anyone have any tips about this, please :)?

Regards

From: galaxy-dev-boun...@lists.bx.psu.edu [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Hakeem Almabrazi
Sent: Monday, April 21, 2014 3:49 PM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] DRMAA configuring issue

[quoted original message omitted; see the next post]
[galaxy-dev] DRMAA configuring issue
Hi, I am trying to get the DRMAA runner working for my local Galaxy cluster; however, I am having a hard time configuring it on my system. So far, I have installed Torque 2.5.12 and it seems to work as expected. I installed drmaa_1.0.17, and here is DRMAA_LIBRARY_PATH:

    (galaxy_env)galaxy@GalaxyTest01[/home/galaxy/galaxy-dist]$ echo $DRMAA_LIBRARY_PATH
    /usr/local/lib/libdrmaa.so

My job_conf.xml:

    <?xml version="1.0"?>
    <!-- A sample job config that explicitly configures job running the way it is
         configured by default (if there is no explicit config). -->
    <job_conf>
        <plugins>
            <plugin id="sge" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" workers="4"/>
        </plugins>
        <handlers default="handlers">
            <handler id="main" tags="handlers"/>
        </handlers>
        <destinations default="sge_default">
            <destination id="sge_default" runner="drmaa"/>
        </destinations>
    </job_conf>

This is the error I am getting when I start Galaxy:

galaxy.jobs INFO 2014-04-21 15:37:30,730 Handler 'main' will load all configured runner plugins
Traceback (most recent call last):
  File "/home/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/buildapp.py", line 39, in app_factory
    app = UniverseApplication( global_conf = global_conf, **kwargs )
  File "/home/galaxy/galaxy-dist/lib/galaxy/app.py", line 130, in __init__
    self.job_manager = manager.JobManager( self )
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/manager.py", line 31, in __init__
    self.job_handler = handler.JobHandler( app )
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py", line 30, in __init__
    self.dispatcher = DefaultJobDispatcher( app )
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py", line 568, in __init__
    self.job_runners = self.app.job_config.get_job_runner_plugins( self.app.config.server_name )
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 489, in get_job_runner_plugins
    rval[id] = runner_class( self.app, runner[ 'workers' ], **runner.get( 'kwds', {} ) )
  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 81, in __init__
    self.ds.initialize()
  File "/home/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/__init__.py", line 274, in initialize
    _w.init(contactString)
  File "/home/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/wrappers.py", line 59, in init
    return _lib.drmaa_init(contact, error_buffer, sizeof(error_buffer))
  File "/home/galaxy/galaxy-dist/eggs/drmaa-0.6-py2.6.egg/drmaa/errors.py", line 90, in error_check
    raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))
DrmCommunicationException: code 2: (null)
Removing PID file paster.pid

I am not sure what the issue is here or how to go about resolving it. I would really appreciate it if someone could tell me how to debug it.

Best regards, Hak
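When drmaa_init fails with a communication error like this, it can help to take Galaxy out of the picture and drive the same library directly with the python-drmaa bindings shown in the traceback (a minimal smoke test, assuming DRMAA_LIBRARY_PATH is exported as above):

    import drmaa

    s = drmaa.Session()
    s.initialize()  # the same call that fails inside Galaxy
    print("DRMAA implementation: %s" % s.drmaaImplementation)
    print("DRM system: %s" % s.drmsInfo)
    s.exit()

If this script raises the same DrmCommunicationException (code 2 is a DRM communication failure), the problem sits between libdrmaa and the Torque server, and no Galaxy configuration change will fix it.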
[galaxy-dev] DRMAA bug in galaxy-central
Hi, for the past few days frequent error messages have been popping up in my Galaxy log files:

galaxy.jobs.runners ERROR 2014-01-17 13:11:17,094 Unhandled exception checking active jobs
Traceback (most recent call last):
  File "/usr/local/galaxy/galaxy-dist/lib/galaxy/jobs/runners/__init__.py", line 358, in monitor
    self.check_watched_items()
  File "/usr/local/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 238, in check_watched_items
    if self.runner_params[ retry_param ] > 0:
TypeError: 'RunnerParams' object has no attribute '__getitem__'

Cheers, Bjoern
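For context, the exception means the runner code indexes runner_params with square brackets while the RunnerParams wrapper only supported attribute-style access. A minimal sketch of the shape of the problem and the one-method fix; the class layout here is assumed for illustration, and John's pull request below is the actual fix:

    class RunnerParams(object):
        # Assumed layout: wraps the runner plugin's configured parameters.
        def __init__(self, params):
            self.params = params

        def __getattr__(self, name):
            return self.params[name]

        # Without this method, self.runner_params[ retry_param ] raises
        # TypeError: 'RunnerParams' object has no attribute '__getitem__'
        def __getitem__(self, key):
            return self.params[key]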
Re: [galaxy-dev] DRMAA bug in galaxy-central
It is hard to test error states... but I assume you have the setup for it :). Any chance you can apply these patches and let me know if they fix the problem? I assume they will.

https://bitbucket.org/galaxy/galaxy-central/pull-request/300/potential-drmaa-fixes

-John

On Fri, Jan 17, 2014 at 6:15 AM, Bjoern Gruening <bjoern.gruen...@gmail.com> wrote:
[quoted original message omitted; see the previous post]
Re: [galaxy-dev] DRMAA bug in galaxy-central
Thanks John, that fixed it for me! Have a nice weekend,
Bjoern

[quoted earlier messages omitted]
[galaxy-dev] DRMAA/SGE job handling regression?
Hello all, On our main Galaxy instance tracking galaxy-dist using DRMAA/SGE, jobs submitted to the cluster that are queued and waiting (qw) are correctly shown in Galaxy as grey pending entries in the history. With my test instance tracking galaxy-central (along with a new visual look and new icons), such jobs are wrongly shown as yellow (running). Is this a general regression affecting other people?

There also seem to be issues where killing a job in Galaxy just hides it but it remains running (yellow once you tick "show deleted datasets", and running on SGE too). This was working properly on galaxy-dist (the job was killed on the cluster, and shown as red if you ticked "show deleted datasets").

Thanks, Peter
Re: [galaxy-dev] DRMAA Runner URL Specify TORQUE host
I've had success using the pbs runner rather than the drmaa runner for this case. It's quite straightforward to specify the pbs_server for the pbs runner. Works just as the documentation indicates.

- Bart

On Tue, Jul 9, 2013 at 1:46 PM, Bart Gottschalk <bgott...@umn.edu> wrote:
[quoted earlier message omitted; see the next post]
Re: [galaxy-dev] DRMAA Runner URL Specify TORQUE host
I haven't been able to find a way to make the drmaa runner work in this situation. I'm going to move on to trying this with a pbs runner instead. I will post to this thread if this works for me.

- Bart
[galaxy-dev] DRMAA Runner URL Specify TORQUE host
Is it possible to specify the torque host as part of a DRMAA runner URL? I haven't been able to find a native_options parameter to allow for this. I'm using the old style cluster configuration:

    drmaa://[native_options]/

Also, I haven't been able to find a list of native_options anywhere. Does anyone have a link to a comprehensive list?

- Bart
[galaxy-dev] DRMAA Runner URL Specify TORQUE host
Is it possible to specify the torque host as part of a DRMAA runner URL? I haven't been able to find a native_options parameter to allow for this. I'm using the old style cluster configuration:

    drmaa://[native_options]/

Also, I haven't been able to find a list of native_options anywhere. Does such a list exist? If so, where?
Re: [galaxy-dev] DRMAA Runner URL Specify TORQUE host
Bart, I believe drmaa://-q somehost@queue-name will work. However, I could be very wrong. It has been a while since I messed with the actual drmaa runners.

--
Adam Brenner
Computer Science, Undergraduate Student
Donald Bren School of Information and Computer Sciences
Research Computing Support
Office of Information Technology
http://www.oit.uci.edu/rcs/
University of California, Irvine
www.ics.uci.edu/~aebrenne/
aebre...@uci.edu

On Wed, Jun 26, 2013 at 1:28 PM, Bart Gottschalk <bgott...@umn.edu> wrote:
[quoted original message omitted; see the previous post]
[galaxy-dev] drmaa and JSV
Hi, Our Galaxy instance runs jobs in an SGE cluster using 2 job handlers. The SGE cluster uses a Job Submission Verifier (JSV) that rejects any job submission that specifies core binding strategies. When Galaxy starts, the first job we submit works perfectly.

First job submission:

galaxy.jobs.manager DEBUG 2013-04-15 14:29:59,285 (194) Job assigned to handler 'handler0'
galaxy.jobs DEBUG 2013-04-15 14:29:59,934 (194) Working directory for job is: /scratch/nfs/galaxy.crg.es/job_working_directory/000/194
galaxy.jobs.handler DEBUG 2013-04-15 14:29:59,942 dispatching job 194 to drmaa runner
galaxy.jobs.handler INFO 2013-04-15 14:30:00,166 (194) Job dispatched
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:00,468 (194) submitting file /scratch/nfs/galaxy.crg.es/ogs/galaxy_194.sh
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:00,468 (194) command is: python /data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/tools/fastq/fastq_stats.py '/data/www-bi/galaxy.crg.es/files/000/dataset_4.dat' '/data/www-bi/galaxy.crg.es/files/000/dataset_238.dat' 'sanger'
galaxy.jobs.runners.drmaa INFO 2013-04-15 14:30:01,538 (194) queued as 458816
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:02,115 (194/458816) state change: job is queued and active

# qstat -cb -j 458816
==============================================================
job_number:             458816
exec_file:              job_scripts/458816
submission_time:        Mon Apr 15 14:30:01 2013
owner:                  www-bi
uid:                    66401
group:                  www-bi
gid:                    501
sge_o_home:             /data/www-bi
sge_o_log_name:         www-bi
sge_o_path:             /data/galaxy/apache/galaxy.crg.es/htdocs/scripts/galaxy-env/bin:/software/galaxy/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/data/www-bi/bin
sge_o_shell:            /bin/bash
sge_o_workdir:          /data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist
sge_o_host:             galaxy
account:                sge
stderr_path_list:       NONE:galaxy:/scratch/nfs/galaxy.crg.es/job_working_directory/000/194/194.drmerr
reserve:                y
hard resource_list:     virtual_free=12G,h_rt=21600
mail_list:              www...@galaxy.crg.es
notify:                 FALSE
job_name:               g194_fastq_stats_jtaly_crg_es
stdout_path_list:       NONE:galaxy:/scratch/nfs/galaxy.crg.es/job_working_directory/000/194/194.drmout
jobshare:               0
hard_queue_list:        www-el6
env_list:
script_file:            /scratch/nfs/galaxy.crg.es/ogs/galaxy_194.sh
parallel environment:   smp range: 2
verify_suitable_queues: 2
binding:                set linear:2:0,0
scheduling info:        queue instance "pr-...@fenn.linux.crg.es" dropped because it is overloaded: np_load_avg=1.70 (= 1.70 + 0.50 * 0.00 with nproc=12) >= 1.7
                        queue instance "sh...@node-ib0209bi.linux.crg.es" dropped because it is overloaded: np_load_avg=2.837500 (= 2.837500 + 0.50 * 0.00 with nproc=8) >= 1.3
                        queue instance "l...@node-ib0209bi.linux.crg.es" dropped because it is overloaded: np_load_avg=2.837500 (= 2.837500 + 0.50 * 0.00 with nproc=8) >= 1.3

The core binding has been added by our JSV script. This is correct.
But our second submission fails:

galaxy.jobs.runners.drmaa ERROR 2013-04-15 14:30:56,263 Uncaught exception queueing job
Traceback (most recent call last):
  File "/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 144, in run_next
    self.queue_job( obj )
  File "/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 232, in queue_job
    job_id = self.ds.runJob(jt)
  File "/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py", line 331, in runJob
    _h.c(_w.drmaa_run_job, jid, _ct.sizeof(jid), jobTemplate)
  File "/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py", line 213, in c
    return f(*(args + (error_buffer, sizeof(error_buffer))))
  File "/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py", line 90, in error_check
    raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))
DeniedByDrmException: code 17: contact us: x...@xxx.es

If we look at the submitted params:

# cat /tmp/qsub_err.txt
$VAR1 = {
          'w' => 'e',
          'N' => 'g195_fastq_stats_jtaly_crg_es',
          'binding_amount' => '2',
          'CMDNAME' => '/scratch/nfs/galaxy.crg.es/ogs/galaxy_195.sh',
          'binding_type' => 'set',
          'M' => {
                   'www...@galaxy.crg.es' => undef
                 },
          'binding_strategy' => 'linear',
          'l_hard' => {
                        'virtual_free' => '12G',
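One way to narrow this down outside Galaxy is to replay two consecutive submissions with a fresh DRMAA job template each time; if a clean template still arrives at the JSV with binding parameters attached, they are being injected on the SGE side (sge_request defaults or the JSV itself) rather than by Galaxy's runner. A minimal sketch with the python-drmaa bindings; the command and resource list are placeholders echoing the job above:

    import drmaa

    s = drmaa.Session()
    s.initialize()
    for i in range(2):  # the reported failure is on the second submission
        jt = s.createJobTemplate()          # fresh template per job
        jt.remoteCommand = "/bin/hostname"  # placeholder command
        jt.nativeSpecification = "-l virtual_free=12G,h_rt=21600"
        print("submitted as %s" % s.runJob(jt))
        s.deleteJobTemplate(jt)             # release the template explicitly
    s.exit()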
Re: [galaxy-dev] DRMAA runner weirdness
Hello, Can you please post the link to this patch? I do not see it in the mail thread, and I too have noticed some issues with DRMAA job running since updating to the Oct. 23rd distribution. I don't know if it is related yet, but I'd like to try the patch to see. I have two local instances of Galaxy (prod and dev). On my dev instance (which is fully up to date), when I run the same job multiple times, sometimes it finishes and sometimes it dies, independent of which node it runs on. My prod instance is still at the Oct. 03 distribution and does not experience this problem, so I am afraid to update our production instance.

Thanks in advance, Liisa

From: Kyle Ellrott <kellr...@soe.ucsc.edu>
To: Nate Coraor <n...@bx.psu.edu>
Cc: galaxy-dev@lists.bx.psu.edu
Date: 10/01/2013 07:44 PM
Subject: Re: [galaxy-dev] DRMAA runner weirdness

I did a merge of galaxy-central that included the patch you posted today. The scheduling problem seems to have gone away, although I'm still getting back 'Job output not returned from cluster' for errors. This seems odd, as the system previously would output stderr correctly.

Kyle

On Thu, Jan 10, 2013 at 8:30 AM, Nate Coraor <n...@bx.psu.edu> wrote:
On Jan 9, 2013, at 12:18 AM, Kyle Ellrott wrote:
[quoted original message omitted; see the next post]

Hi Kyle, It sounds like there are a bunch of issues here. Do you have any limits set as to the number of concurrent jobs allowed? If not, you may need to add a bit of debugging information to the manager or handler code to figure out why the 'new' job is not being dispatched for execution. For the 'error' job, more information about output collection should be available from the Galaxy server log. If you have general SGE problems this may not be Galaxy's fault; you do need to make sure that the stdout/stderr files are able to be properly copied back to the Galaxy server upon job completion. For the 'running' job, make sure you've got 'set_metadata_externally = True' in your Galaxy config.

--nate
[galaxy-dev] DRMAA runner weirdness
I'm running a test Galaxy system on a cluster (merged galaxy-dist on January 4th), and I've noticed some odd behavior from the DRMAA job runner. I'm running a multithreaded system: one web server, one job_manager, and three job_handlers. DRMAA is the default job runner (the command for tophat2 is drmaa://-V -l mem_total=7G -pe smp 2/), with SGE 6.2u5 being the engine underneath.

My test involves trying to run three different Tophat2 jobs. The first two seem to start up (and get put on the SGE queue), but the third stays grey, with the job manager listing it in state 'new' with command line 'None'. It doesn't seem to leave this state. Both of the jobs that actually got onto the queue die (reasons unknown, but much too early, probably some tophat/bowtie problem), but one job is listed in error state with stderr as 'Job output not returned from cluster', while the other job (which is no longer in the SGE queue) is still listed as running.

Any ideas?

Kyle
Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault
On Nov 20, 2012, at 8:15 AM, Peter Cock wrote:
[quoted earlier messages omitted; see the posts below]

Hi Peter, These look like two issues - in one, you've got task(s) in the database that do not have an external runner ID set, causing the drmaa runner to attempt to check the status of "None", resulting in the segfault. If you update the state of these tasks to something terminal, that should fix the issue with them. Of course, if the same thing happens with new jobs, then there's another issue.

I'm trying to reproduce the working directory behavior but have been unsuccessful. Do you have any local modifications to the splitting or jobs code?

--nate
Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault
On Nov 27, 2012, at 12:03 PM, Peter Cock wrote:

On Tue, Nov 27, 2012 at 4:50 PM, Nate Coraor <n...@bx.psu.edu> wrote:
On Nov 20, 2012, at 8:15 AM, Peter Cock wrote:
[quoted earlier messages omitted]

Hi Peter, These look like two issues - in one, you've got task(s) in the database that do not have an external runner ID set, causing the drmaa runner to attempt to check the status of "None", resulting in the segfault.

So a little defensive coding could prevent the segfault then (leaving the separate issue of why the jobs lack this information)?

Indeed, I pushed a check for this in 4a95ae9a26d9.

If you update the state of these tasks to something terminal, that should fix the issue with them.

You mean manually in the database? Restarting Galaxy seemed to achieve that in a round-about way.

Of course, if the same thing happens with new jobs, then there's another issue.

This was a week ago, but yes, at the time it was reproducible with new jobs.

Is that to say it's still happening and you've simply worked around it (by disabling tasks), or that it is no longer happening?

I'm trying to reproduce the working directory behavior but have been unsuccessful. Do you have any local modifications to the splitting or jobs code?

This was running on my tools branch, which shouldn't be changing Galaxy itself in any meaningful way (a few local variables did get accidentally checked into my run.sh file etc., but otherwise I only try to modify new files specific to my individual tool wrappers):
https://bitbucket.org/peterjc/galaxy-central/src/tools

[galaxy@ppserver galaxy-central]$ hg branch
tools
[galaxy@ppserver galaxy-central]$ hg log -b tools | head -n 8
changeset:   8807:d49200df0707
branch:      tools
tag:         tip
parent:      8712:959ee7c79fd2
parent:      8806:340438c62171
user:        peterjc <p.j.a.c...@googlemail.com>
date:        Thu Nov 15 09:38:57 2012 +0000
summary:     Merged default into my tools branch

The only deliberate change was to try and debug this:

[galaxy@ppserver galaxy-central]$ hg diff
diff -r d49200df0707 lib/galaxy/jobs/runners/drmaa.py
--- a/lib/galaxy/jobs/runners/drmaa.py  Thu Nov 15 09:38:57 2012 +0000
+++ b/lib/galaxy/jobs/runners/drmaa.py  Tue Nov 27 17:00:04 2012 +0000
@@ -291,8 +291,15 @@
         for drm_job_state in self.watched:
             job_id = drm_job_state.job_id
             galaxy_job_id = drm_job_state.job_wrapper.job_id
+            if job_id is None or job_id=="None":
+                log.exception( "(%s/%r) Unable to check job status none" % ( galaxy_job_id, job_id ) )
+                #drm_job_state.fail_message = "Cluster could not complete job (job_id None)"
+                #Ignore it?
+                #self.work_queue.put( ( 'fail', drm_job_state ) )
+                continue
             old_state = drm_job_state.old_state
             try:
+                assert job_id is not None and job_id != "None"
                 state = self.ds.jobStatus( job_id )
                 # InternalException was reported to be necessary on some DRMs, but
                 # this could cause failures to be detected as completion! Please

I'm about to go home for the day but should be able to look into this tomorrow, e.g. update to the latest default branch.

Great, thanks.

--nate
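Condensed, the guard in the diff above boils down to one predicate. Note the literal string "None": an unset external id can surface that way after being stored and reloaded as text, and it is exactly what drmaa_job_ps rejects with InvalidArgumentException (a sketch mirroring the debug patch, not the committed fix in 4a95ae9a26d9):

    def valid_external_id(job_id):
        # Unset ids can appear either as the None object or as the
        # string "None" once round-tripped through the database as text.
        return job_id is not None and job_id != "None"

Checking this before calling self.ds.jobStatus( job_id ) keeps "None" away from the underlying library, which is where the segfault followed the InvalidArgumentException.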
Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault
On Tue, Nov 27, 2012 at 5:19 PM, Nate Coraor <n...@bx.psu.edu> wrote:

So a little defensive coding could prevent the segfault then (leaving the separate issue of why the jobs lack this information)?

Indeed, I pushed a check for this in 4a95ae9a26d9.

Great. That will help.

This was a week ago, but yes, at the time it was reproducible with new jobs.

Is that to say it's still happening and you've simply worked around it (by disabling tasks), or that it is no longer happening?

I've not tried it for a week - it was my development install that tracks galaxy-central which showed the problem, so I avoided updating our production install until resolving it. I'll see how it behaves later this week (although there is a mass job hogging the cluster queue, which may complicate matters and slow the turnaround).

Thanks, Peter
Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault
On Thu, Nov 15, 2012 at 11:21 AM, Peter Cock <p.j.a.c...@googlemail.com> wrote:
[quoted earlier messages omitted; see the posts below]

Is anyone else seeing this? I am wary of applying the update to our production Galaxy until I know how to resolve this (other than just by disabling task splitting).

Thanks, Peter
[galaxy-dev] DRMAA job will now be errored - Segmentation fault
Hi all, Something has changed in the job handling, and in a bad way. On my development machine, submitting jobs to the cluster didn't seem to be working anymore (never sent to SGE). I killed Galaxy and restarted:

Starting server in PID 12180.
serving on http://127.0.0.1:8081
galaxy.jobs.runners.drmaa ERROR 2012-11-15 09:56:28,192 (320/None) Unable to check job status
Traceback (most recent call last):
  File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 296, in check_watched_items
    state = self.ds.jobStatus( job_id )
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py", line 522, in jobStatus
    _h.c(_w.drmaa_job_ps, jobName, _ct.byref(status))
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py", line 213, in c
    return f(*(args + (error_buffer, sizeof(error_buffer))))
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py", line 90, in error_check
    raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))
InvalidArgumentException: code 4: Job id, None, is not a valid job id
galaxy.jobs.runners.drmaa WARNING 2012-11-15 09:56:28,193 (320/None) job will now be errored
./run.sh: line 86: 12180 Segmentation fault (core dumped) python ./scripts/paster.py serve universe_wsgi.ini $@

I restarted and it happened again; third time lucky. I presume this was one segmentation fault for each orphaned/zombie job (since I'd tried two cluster jobs which got stuck). I was running with revision 340438c62171,
https://bitbucket.org/galaxy/galaxy-central/changeset/340438c62171578078323d39da398d5053b69d0a
as merged into my tools branch,
https://bitbucket.org/peterjc/galaxy-central/changeset/d49200df0707579f41fc4f25042354604ce20e63

Any thoughts?

Thanks, Peter
Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault
On Thu, Nov 15, 2012 at 10:06 AM, Peter Cock <p.j.a.c...@googlemail.com> wrote:
Hi all, Something has changed in the job handling, and in a bad way. On my development machine submitting jobs to the cluster didn't seem to be working anymore (never sent to SGE). I killed Galaxy and restarted: ... (segmentation fault)

Looking into the problem with submitting the jobs, there seems to be a problem with task splitting somehow recursing - the same file is split four times, the filename getting longer and longer:

galaxy.jobs DEBUG 2012-11-15 10:08:33,510 (321) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:33,510 dispatching job 321 to tasks runner
galaxy.jobs.handler INFO 2012-11-15 10:08:33,714 (321) Job dispatched
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:34,457 Split /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into batches of 1000 records...
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:34,457 Attemping to split FASTA file /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into chunks of 1000 sequences
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:34,458 Writing /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat part to /mnt/galaxy/galaxy-central/database/job_working_directory/000/321/task_0/dataset_344.dat
galaxy.jobs.splitters.multi DEBUG 2012-11-15 10:08:34,458 do_split created 1 parts
galaxy.jobs DEBUG 2012-11-15 10:08:34,558 (321) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:34,558 dispatching task 823, of job 321, to tasks runner
127.0.0.1 - - [15/Nov/2012:10:08:35 +0100] "POST /root/history_item_updates HTTP/1.1" 200 - "http://127.0.0.1:8081/history" "Mozilla/5.0 (X11; Linux x86_64; rv:10.0.8) Gecko/20121012 Firefox/10.0.8"
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:35,458 Split /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into batches of 1000 records...
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:35,459 Attemping to split FASTA file /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into chunks of 1000 sequences
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:35,459 Writing /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat part to /mnt/galaxy/galaxy-central/database/job_working_directory/000/321/task_0/task_0/dataset_344.dat
galaxy.jobs.splitters.multi DEBUG 2012-11-15 10:08:35,459 do_split created 1 parts
galaxy.jobs DEBUG 2012-11-15 10:08:35,541 (321) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:35,542 dispatching task 824, of job 321, to tasks runner
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,171 Split /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into batches of 1000 records...
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,171 Attemping to split FASTA file /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into chunks of 1000 sequences
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,171 Writing /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat part to /mnt/galaxy/galaxy-central/database/job_working_directory/000/321/task_0/task_0/task_0/dataset_344.dat
galaxy.jobs.splitters.multi DEBUG 2012-11-15 10:08:36,172 do_split created 1 parts
galaxy.jobs DEBUG 2012-11-15 10:08:36,232 (321) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:36,232 dispatching task 825, of job 321, to tasks runner
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,843 Split /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into batches of 1000 records...
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,843 Attemping to split FASTA file /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat into chunks of 1000 sequences
galaxy.datatypes.sequence DEBUG 2012-11-15 10:08:36,843 Writing /mnt/galaxy/galaxy-central/database/files/000/dataset_344.dat part to /mnt/galaxy/galaxy-central/database/job_working_directory/000/321/task_0/task_0/task_0/task_0/dataset_344.dat
galaxy.jobs.splitters.multi DEBUG 2012-11-15 10:08:36,844 do_split created 1 parts
galaxy.jobs DEBUG 2012-11-15 10:08:36,906 (321) Working directory for job is: /mnt/galaxy/galaxy-central/database/job_working_directory/000/321
galaxy.jobs.handler DEBUG 2012-11-15 10:08:36,906 dispatching task 826, of job 321, to tasks runner

Hmm.

Peter
Re: [galaxy-dev] DRMAA job will now be errored - Segmentation fault
On Thu, Nov 15, 2012 at 10:12 AM, Peter Cock <p.j.a.c...@googlemail.com> wrote:
[quoted earlier messages omitted; see the previous posts]

Turning off task splitting, I could run the same job OK on SGE. So, the good news is the problems seem to be specific to the task splitting code. Also, I have reproduced the segmentation fault when restarting Galaxy (after stopping Galaxy with one of these broken jobs):

Starting server in PID 17996.
serving on http://127.0.0.1:8081
galaxy.jobs.runners.drmaa ERROR 2012-11-15 11:07:27,762 (327/None) Unable to check job status
Traceback (most recent call last):
  File "/mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py", line 296, in check_watched_items
    state = self.ds.jobStatus( job_id )
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py", line 522, in jobStatus
    _h.c(_w.drmaa_job_ps, jobName, _ct.byref(status))
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py", line 213, in c
    return f(*(args + (error_buffer, sizeof(error_buffer))))
  File "/mnt/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py", line 90, in error_check
    raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))
InvalidArgumentException: code 4: Job id, None, is not a valid job id
galaxy.jobs.runners.drmaa WARNING 2012-11-15 11:07:27,764 (327/None) job will now be errored
./run.sh: line 86: 17996 Segmentation fault (core dumped) python ./scripts/paster.py serve universe_wsgi.ini $@

The problem is the job_id variable is "None" (note this is a string, not the Python special object None) in check_watched_items().

Peter
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
On Tue, Sep 18, 2012 at 7:11 PM, Scott McManus <scottmcma...@gatech.edu> wrote:
Sorry - that's changeset 7714:3f12146d6d81
-Scott

Hi Scott, The good news is this error does seem to be fixed as of that commit:

TypeError: check_tool_output() takes exactly 5 arguments (4 given)

The bad news is my cluster jobs still aren't working properly (using a job splitter). The jobs seem to run, get submitted to the cluster, and finish, and the data looks OK via the 'eye' view icon, but the dataset is red in the history with:

0 bytes
An error occurred running this job: info unavailable

I will investigate - it is likely due to another change... perhaps in the new stdout/stderr/return code support?

Peter
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
Odd, it works for me on EC2/Cloudman.

jorrit

On 09/19/2012 03:29 PM, Peter Cock wrote:
[...]
[galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
Hi all (and in particular, Scott),

I've just updated my development server and found the following error when running jobs on our SGE cluster via DRMAA:

galaxy.jobs.runners.drmaa ERROR 2012-09-18 09:43:20,698 Job wrapper finish method failed
Traceback (most recent call last):
  File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, line 371, in finish_job
    drm_job_state.job_wrapper.finish( stdout, stderr, exit_code )
  File /mnt/galaxy/galaxy-central/lib/galaxy/jobs/__init__.py, line 1048, in finish
    if ( self.check_tool_output( stdout, stderr, tool_exit_code ) ):
TypeError: check_tool_output() takes exactly 5 arguments (4 given)

This looks to have been introduced in this commit:
https://bitbucket.org/galaxy/galaxy-central/changeset/f557b7b05fdd701cbf99ee04f311bcadb1ae29c4#chg-lib/galaxy/jobs/__init__.py

There should be an additional job argument; proposed fix:

$ hg diff lib/galaxy/jobs/__init__.py
diff -r 4007494e37e1 lib/galaxy/jobs/__init__.py
--- a/lib/galaxy/jobs/__init__.py Tue Sep 18 09:40:19 2012 +0100
+++ b/lib/galaxy/jobs/__init__.py Tue Sep 18 10:06:44 2012 +0100
@@ -1045,7 +1045,8 @@
         # Check what the tool returned. If the stdout or stderr matched
         # regular expressions that indicate errors, then set an error.
         # The same goes if the tool's exit code was in a given range.
-        if ( self.check_tool_output( stdout, stderr, tool_exit_code ) ):
+        job = self.get_job()
+        if ( self.check_tool_output( stdout, stderr, tool_exit_code, job ) ):
             task.state = task.states.OK
         else:
             task.state = task.states.ERROR

(Let me know if you want this as a pull request - it seems a lot of effort for a tiny change.)

Regards,

Peter
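As an aside, the "exactly 5 arguments (4 given)" count includes self, because check_tool_output is an instance method. A minimal standalone reproduction (illustrative stub, not Galaxy's code):

class Wrapper(object):
    # five parameters in total, counting self
    def check_tool_output(self, stdout, stderr, tool_exit_code, job):
        return tool_exit_code == 0

Wrapper().check_tool_output("out", "err", 0)
# TypeError: check_tool_output() takes exactly 5 arguments (4 given)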
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
I'll check it out. Thanks.

- Original Message -
[...]
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
I have to admit that I'm a little confused as to why you would be getting this error at all - the job variable is introduced at line 298 in the same file, and it's passed as the last argument to check_tool_output in the changeset you pointed to. (Also, thanks for pointing to it - that made investigating easier.)

Is it possible that there was a merge problem when you pulled the latest set of code? For my own sanity, would you mind downloading a fresh copy of galaxy-central or galaxy-dist into a separate directory and seeing if the problem is still there? (I fully admit that there could be a bug that I left in, but all job runners should have stumbled across the same problem - the finish method should be called by all job runners.)

Thanks again!

-Scott

- Original Message -
[...]
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
Is it possible that you are looking at different classes? TaskWrapper's finish method does not use the job variable in my recently merged code either (line ~1045), while JobWrapper's does, around line 315.

cheers,
jorrit

On 09/18/2012 03:55 PM, Scott McManus wrote:
[...]
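To make the mismatch concrete, a schematic sketch of the two finish methods as described in this thread (illustrative stubs, not the real code in lib/galaxy/jobs/__init__.py):

class JobWrapper(object):
    def get_job(self):
        return object()  # stand-in for the real Job model object
    def check_tool_output(self, stdout, stderr, tool_exit_code, job):
        return tool_exit_code == 0
    def finish(self, stdout, stderr, tool_exit_code):
        # JobWrapper.finish was updated to fetch the job and pass it along...
        job = self.get_job()
        return self.check_tool_output(stdout, stderr, tool_exit_code, job)

class TaskWrapper(JobWrapper):
    def finish(self, stdout, stderr, tool_exit_code):
        # ...but TaskWrapper's override still passed only four arguments,
        # so only task-split jobs hit the TypeError.
        return self.check_tool_output(stdout, stderr, tool_exit_code)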
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
Thanks, Jorrit! That was a good catch. Yes, it's a problem with the TaskWrapper. I'll see what I can do about it.

-Scott

- Original Message -
[...]
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
On Tue, Sep 18, 2012 at 3:09 PM, Jorrit Boekel jorrit.boe...@scilifelab.se wrote:
[...]

Yes exactly (as per my follow-up email sent just before yours ;) )

Peter
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
Ok - that change was made. The difference is that the change is applied to the task instead of the job. It's in changeset 7713:bfd10aa67c78, and it ran successfully in my environments on local, pbs, and drmaa runners. Let me know if there are any problems. Thanks again for your patience.

-Scott

- Original Message -
[...]
Re: [galaxy-dev] DRMAA: TypeError: check_tool_output() takes exactly 5 arguments (4 given)
Sorry - that's changeset 7714:3f12146d6d81

-Scott

- Original Message -
[...]
Re: [galaxy-dev] drmaa module does not load
On 27/03/2012 11:03, Louise-Amélie Schmitt wrote:
On 26/03/2012 16:13, Nate Coraor wrote:
On Mar 26, 2012, at 5:11 AM, Louise-Amélie Schmitt wrote:

Hello everyone,

I wanted to start the drmaa job runner and followed the instructions in the wiki, but I have this error message when I start Galaxy:

galaxy.jobs ERROR 2012-03-23 15:28:49,845 Job runner is not loadable: galaxy.jobs.runners. drmaa
Traceback (most recent call last):
  File /g/funcgen/galaxy/lib/galaxy/jobs/__init__.py, line 1195, in _load_plugin
    module = __import__( module_name )
ImportError: No module named drmaa

I checked /g/funcgen/galaxy/lib/galaxy/jobs/runners and it contains the drmaa.py file. There was no drmaa egg, so I made a copy of it from our other Galaxy install, but it didn't solve the problem. I don't really know where to start looking - any idea?

Thanks,
L-A

Hi L-A,

There's an errant space in the runner name: ' drmaa'. I am going to guess that your start_job_runners looks like:

start_job_runners = pbs, drmaa

Only the whitespace at the beginning and end of that parameter is stripped. I've committed a fix for this that'll be in the next galaxy-dist, but in the meantime, remove the space after the comma and the drmaa runner should load.

--nate

Hi Nate,

After removing the space (and a facepalm) it now fetches the drmaa egg and starts the module properly, thanks a lot!

I still have an issue though: when I use the run shell script, Galaxy crashes right before loading the drmaa runner (right after the pbs runner is loaded). The weird thing is that when I launch the command manually it works fine:

python ./scripts/paster.py serve universe_wsgi.runner.ini --server-name=runner0 --pid-file=runner0.pid --log-file=runner0.log --daemon

The other weird thing is that I get no error message at all. I'll try looking into it, but if you have any idea about what's going wrong, it would help greatly :)

Thanks again,
L-A

Ok, it looks like I didn't notice the changes in the start scripts, and since we use two custom copies of run.sh the code was outdated. I corrected that and it now seems to work properly.

Best,
L-A
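The class of bug Nate describes is easy to reproduce. A minimal sketch of the parsing problem and its fix (illustrative code, not Galaxy's actual parser):

value = "pbs, drmaa"

# Buggy: only the ends of the whole string are stripped, so the second
# entry comes out as ' drmaa' and importing 'galaxy.jobs.runners. drmaa' fails.
runners_buggy = value.strip().split(",")               # ['pbs', ' drmaa']

# Fixed: strip each entry after splitting.
runners_fixed = [r.strip() for r in value.split(",")]  # ['pbs', 'drmaa']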
[galaxy-dev] drmaa module does not load
Hello everyone,

I wanted to start the drmaa job runner and followed the instructions in the wiki, but I have this error message when I start Galaxy:

galaxy.jobs ERROR 2012-03-23 15:28:49,845 Job runner is not loadable: galaxy.jobs.runners. drmaa
Traceback (most recent call last):
  File /g/funcgen/galaxy/lib/galaxy/jobs/__init__.py, line 1195, in _load_plugin
    module = __import__( module_name )
ImportError: No module named drmaa

I checked /g/funcgen/galaxy/lib/galaxy/jobs/runners and it contains the drmaa.py file. There was no drmaa egg, so I made a copy of it from our other Galaxy install, but it didn't solve the problem. I don't really know where to start looking - any idea?

Thanks,
L-A
Re: [galaxy-dev] DRMAA error with latest update 26920e20157f
Figured it out. It was an error introduced while resolving version control conflicts.

-- Shantanu

On Jan 29, 2012, at 9:12 PM, Shantanu Pavgi wrote:
[...]
[galaxy-dev] DRMAA error with latest update 26920e20157f
I am getting the following error with the latest galaxy-dist revision '26920e20157f' update. The Python version is 2.6.6.

{{{
galaxy.jobs.runners.drmaa ERROR 2012-01-29 21:00:28,577 Uncaught exception queueing job
Traceback (most recent call last):
  File /projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py, line 140, in run_next
    self.queue_job( obj )
  File /projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py, line 190, in queue_job
    command_line )
TypeError: not all arguments converted during string formatting
}}}

I was wondering if anyone else is experiencing this same issue. The system works fine when I roll back to revision 'b258de1e6cea'. Are there any additional configuration details required with the latest revision that I am missing?

-- Shantanu
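For reference, this TypeError is the generic symptom of a '%' format string with more arguments than placeholders; the line in drmaa.py presumably built a log message this way. A standalone reproduction:

try:
    # two arguments, but only one %s placeholder
    "(%s) queueing job" % ("123", "/bin/hostname")
except TypeError as e:
    print(e)  # not all arguments converted during string formatting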
Re: [galaxy-dev] DRMAA broken following SGE update. How to fix?
BUMP

Does anyone have any idea on this? Our Galaxy is currently out of action until this is sorted.

Thanks,
Chris

On 22/08/11 10:08, Chris Cole wrote:
Hi, Following a recent update to our SGE, DRMAA is failing to load in galaxy. The reason being that the path has changed. How do I change the path for galaxy to find the libdrmaa module?
Cheers,
Chris
Re: [galaxy-dev] DRMAA broken following SGE update. How to fix?
Hi Chris,

Take a look at this file in your galaxy directory: lib/galaxy/jobs/runners/drmaa.py. You can export a path for binaries there, for example:

export PATH=$PATH:/opt/bin/

before export PYTHONPATH. Moreover, do not forget other path values in your service script or in your galaxy user profile. Hope this helps.

2011/8/23 Roman Valls brainst...@nopcode.org:
Did you adjust the SGE_ROOT environment variable to point to the libdrmaa for SGE (probably /opt/sge_62u5_gr/bin/lx24-amd64)? This is of course just a guess - could you please provide some error messages/logs?

Cheers,
Roman

On 2011-08-23 10:20, Chris Cole wrote:
[...]
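Another route, assuming the drmaa egg in use honours the DRMAA_LIBRARY_PATH environment variable (newer drmaa-python releases do; treat this as an assumption for older eggs), is to point it straight at the relocated library before the module is imported. The .so path below is only an example:

import os

# Assumption: the drmaa package reads DRMAA_LIBRARY_PATH at import time,
# so the variable must be set before the import. The path is illustrative;
# use wherever your SGE update put libdrmaa.so.
os.environ["DRMAA_LIBRARY_PATH"] = "/opt/sge/lib/lx24-amd64/libdrmaa.so"

import drmaa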
[galaxy-dev] DRMAA broken following SGE update. How to fix?
Hi,

Following a recent update to our SGE, DRMAA is failing to load in galaxy. The reason being that the path has changed. How do I change the path for galaxy to find the libdrmaa module?

Cheers,
Chris

--
Dr Chris Cole
Senior Research Associate (Bioinformatics)
College of Life Sciences
University of Dundee
Dow Street
Dundee DD1 5EH
Scotland, UK
url: http://network.nature.com/profile/drchriscole
e-mail: ch...@compbio.dundee.ac.uk
Tel: +44 (0)1382 388 721
The University of Dundee is a registered Scottish charity, No: SC015096
Re: [galaxy-dev] DRMAA options for SGE
Hi Ambarish,

Using what I had in my previous message:

mytoolname = drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/

my jobs do get submitted to the cluster, and I see the other options from qstat of the jobs, i.e.:

hard resource_list: mem_free=1G,mem_token=1G,h_vmem=1G

Ka Ming

From: ambarish biswas [ambarishbis...@gmail.com]
Sent: July 27, 2011 4:18 PM
To: Ka Ming Nip
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] DRMAA options for SGE

Hi Ming,

Just an idea: -w n might be hiding the error reporting. Are your jobs getting submitted and executed correctly? For the queueing, you can add galaxy at the end, making it:

drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/galaxy

Here galaxy is the queue name.

With Regards,
Ambarish Biswas, University of Otago
Department of Biochemistry, Dunedin, New Zealand

On Thu, Jul 28, 2011 at 6:21 AM, Ka Ming Nip km...@bcgsc.ca wrote:
[...]
Re: [galaxy-dev] DRMAA options for SGE
Hi,

This is what is working for me; the "Output not returned from cluster" error has also disappeared:

default_cluster_job_runner = drmaa://-q galaxy -V/

If this works, then check your other options to test.

With Regards,
Ambarish Biswas, University of Otago
Department of Biochemistry, Dunedin, New Zealand

On Fri, Jul 29, 2011 at 6:32 AM, Ka Ming Nip km...@bcgsc.ca wrote:
[...]
Re: [galaxy-dev] DRMAA options for SGE
Answering myself here... I don't see the error anymore after adding -w n:

[galaxy:tool_runners]
...
mytoolname = drmaa://-w n -l mem_free=1G -l mem_token=1G -l h_vmem=1G/
...

Is this what other Galaxy admins do when using drmaa native options for their SGE?

Ka Ming

From: galaxy-dev-boun...@lists.bx.psu.edu [galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Ka Ming Nip [km...@bcgsc.ca]
Sent: July 26, 2011 10:39 AM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] DRMAA options for SGE
[...]
[galaxy-dev] DRMAA options for SGE
Hi,

I am trying to configure the proper memory resource requests for my Galaxy tool. This is what I have under the tool_runners section of universe_wsgi.ini:

[galaxy:tool_runners]
...
mytoolname = drmaa://-l mem_free=1G -l mem_token=1G -l h_vmem=1G/
...

When I execute my tool on Galaxy, I get the error below in the shell where I ran sh run.sh:

galaxy.jobs.runners.drmaa ERROR 2011-07-26 09:54:01,930 Uncaught exception queueing job
Traceback (most recent call last):
  File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, line 112, in run_next
    self.queue_job( obj )
  File /home/kmnip/galaxy/galaxy-central/lib/galaxy/jobs/runners/drmaa.py, line 177, in queue_job
    job_id = self.ds.runJob(jt)
  File /home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/__init__.py, line 331, in runJob
  File /home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/helpers.py, line 213, in c
  File /home/kmnip/galaxy/galaxy-central/eggs/drmaa-0.4b3-py2.4.egg/drmaa/errors.py, line 90, in error_check
DeniedByDrmException: code 17: error: no suitable queues

All the flags I used work with qsub commands on the SGE cluster I use. The tool runs when I comment out the line in universe_wsgi.ini.

Thanks,
Ka Ming
[galaxy-dev] drmaa://native/ : native options are ignored
Hi,

I'm working on a local installation of Galaxy using Torque with drmaa (the pbs-torque scramble failed). The torque-drmaa setup works fine so far, except for one issue: I'd like to specify some tool-dependent requirements from the tool_runners section in universe_wsgi.ini. For now I've been testing it with the setting below to have global native arguments:

default_cluster_job_runner = drmaa://-l mem=4gb:nodes=1:ppn=6/

This should request 4gb of memory on a single node with 6 threads, but these requests are ignored: they are not listed in 'qstat -R', and more simultaneous jobs are started than would be possible if the requirements were taken into account. What am I missing here?

Best regards,
Geert Vandeweyer
Re: [galaxy-dev] drmaa://native/ : native options are ignored
Hi,

default_cluster_job_runner = drmaa://-q srpipeline -P pipeline/

works for me on LSF, so your syntax seems to be correct. Assuming that -l mem=4gb:nodes=1:ppn=6 works the way you expect when you start jobs on your cluster from the shell, read on...

Bearing in mind that the value of the option contains ':', I'd check that it is read correctly by the parser that parses universe_wsgi.ini and that it is passed correctly to drmaa. If it's passed to drmaa correctly but does not produce the desired effect, I'd look closely at the drmaa library - this might be a bug. It's possible to use the drmaa library from a C program; this way you can test whether drmaa works the way you want. Also, it was rather straightforward to generate a Perl SWIG wrapper for drmaa and write Perl test scripts. SWIG wrappers can be generated for any scripting language and also Java.

Marina

On 20/06/2011 10:53, Geert Vandeweyer wrote:
[...]

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
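Along the same lines, the drmaa egg Galaxy already ships can be driven from a short Python script to check whether the native specification actually reaches the DRM. A sketch against the drmaa-python API (older 0.4b3 eggs expose slightly different method names; the native flags below are the ones from this thread):

import drmaa

# Submit a trivial job with the Torque flags attached, then wait for it.
# While it is queued, 'qstat -f <jobid>' from another shell should show
# the resource requests if drmaa passed them through.
s = drmaa.Session()
s.initialize()
jt = s.createJobTemplate()
jt.remoteCommand = '/bin/hostname'
jt.nativeSpecification = '-l mem=4gb:nodes=1:ppn=6'
job_id = s.runJob(jt)
print('submitted job %s' % job_id)
info = s.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
print('job %s exited with status %s' % (info.jobId, info.exitStatus))
s.deleteJobTemplate(jt)
s.exit()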