Re: [galaxy-dev] Dynamic job runner status? Could I use it in combination with job splitting?

2012-11-05 Thread Oleksandr Moskalenko

On Nov 5, 2012, at 2:15 PM, Carlos Borroto  wrote:

> On Fri, Nov 2, 2012 at 6:03 PM, Oleksandr Moskalenko  wrote:
>> Here is the blastn rule procedure code and the relevant snippet of the
>> default runner procedure. I just added the database-based multiplier, so
>> this part is very simple at the moment. I just set a bogus multiplier of
>> "4" for the "nt_*" databases as an example.
> 
> Hi Alex,
> 
> This is great and definitely helped me get going! I found a few
> issues related to my local configuration. For instance, I'm using the
> ncbi_blastn_wrapper that was migrated to the Tool Shed, so I had to
> use:
> elif 'ncbi_blastn_wrapper' in tool_id_src:
> 
> Instead of:
> elif tool_id_src.startswith('ncbi_blastn_wrapper'):
> 
> The id for the tool from the Tool Shed is:
> toolshed.g2.bx.psu.edu/repos/devteam/ncbi_blast_plus/ncbi_blastn_wrapper/0.0.13
> 
> Hopefully this won't break later on.

I hear you. I thought of using regexes to avoid that sort of breakage, but 
instead started using the output of "tool_id_src = tool_id.split("/")[-2]" for 
my startswith matches. It works well enough.
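
For illustration, a minimal sketch of that id normalization (the fallback for plain, non-shed ids is an assumption here, not a quote from my rules file):

def normalize_tool_id(tool_id):
    # Tool Shed ids look like
    # "toolshed.g2.bx.psu.edu/repos/devteam/ncbi_blast_plus/ncbi_blastn_wrapper/0.0.13",
    # so the second-to-last path component is the short wrapper name.
    # A plain id has no slashes, so fall back to the id itself.
    return tool_id.split("/")[-2] if "/" in tool_id else tool_id

tool_id = "toolshed.g2.bx.psu.edu/repos/devteam/ncbi_blast_plus/ncbi_blastn_wrapper/0.0.13"
tool_id_src = normalize_tool_id(tool_id)
assert tool_id_src.startswith("ncbi_blastn_wrapper")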

> I also need to go back and do a better configuration of our local grid
> engine (using SGE), as I did only a very bare-bones installation and
> I'm running into this error:
> DeniedByDrmException: code 17: unknown resource "nodes"
> 
> I realize this is a configuration issue in my SGE.

Right, my code is Torque/MOAB-specific. You need to rewrite the resource 
requests to use "pe", "slots", "h_vmem", and "h_rt" instead, depending on your 
GE setup. It's an easy change.
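
For example, a rough sketch of an SGE-flavoured request, assuming your drmaa URL takes native options in the form "drmaa://<native options>/" and that a parallel environment named "smp" exists (both are assumptions about your site, not something from my rules):

def sge_request_str(slots=4, h_vmem='4G', h_rt='167:00:00', pe='smp'):
    # SGE native options in place of the Torque-style
    # "-l nodes=N:ppn=M,pmem=...,walltime=..." request I build.
    return '-pe %s %d -l h_vmem=%s,h_rt=%s' % (pe, slots, h_vmem, h_rt)

# e.g. 'drmaa://-pe smp 4 -l h_vmem=4G,h_rt=167:00:00/'
drmaa_url = 'drmaa://%s/' % sge_request_str()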

> Last, and this was my mistake: I didn't initially realize that the
> example you share here assumes all tools will call default_runner(),
> which in turn calls a specific function to figure out the drmaa
> options to set. I was trying to use the lines from your previous
> email:
> ncbi_blastn_wrapper = dynamic:///python/ncbi_blastn
> ncbi_blastp_wrapper = dynamic:///python/ncbi_blastp

Yes, I switched from using separate runner lines to routing the jobs from the 
default runner as a more flexible approach.

> But, as a note to myself and anybody else following the thread, the
> function called from universe_wsgi.ini needs to return a proper
> drmaa/pbs URL.
> 
> Thanks again!
> Carlos

Also, the newest error handling code depends on the changeset John put up in a 
merge request into galaxy-central.

Regards,

Alex


Re: [galaxy-dev] Dynamic job runner status? Could I use it in combination with job splitting?

2012-11-05 Thread Carlos Borroto
On Fri, Nov 2, 2012 at 6:03 PM, Oleksandr Moskalenko  wrote:
> Here is the blastn rule procedure code and the relevant snippet of the
> default runner procedure. I just added the database-based multiplier, so this
> part is very simple at the moment. I just set a bogus multiplier of "4" for
> the "nt_*" databases as an example.
>
>
> def ncbi_blastn(job):
>     nodes = 1
>     ppn = 4
>     walltime = '167:00:00'
>     inp_data = dict( [ ( da.name, da.dataset ) for da in job.input_datasets ] )
>     inp_data.update( [ ( da.name, da.dataset ) for da in job.input_library_datasets ] )
>     query_file = inp_data[ "query" ].file_name
>     query_size = os.path.getsize( query_file )
>     inp_params = dict( [ ( da.name, da.value ) for da in job.parameters ] )
>     inp_params.update( [ ( da.name, da.value ) for da in job.parameters ] )
>     db_dict = eval(inp_params['db_opts'])
>     db = db_dict['database']
>     db_multiplier = 1
>     if db.startswith('nt'):
>         db_multiplier = 4
>     if query_size <= 20 * 1024 * 1024:
>         pmem = 500
>         pmem_unit = 'mb'
>     elif query_size > 20 * 1024 * 1024 and query_size <= 50 * 1024 * 1024:
>         pmem = 750
>         pmem_unit = 'mb'
>     elif query_size > 50 * 1024 * 1024 and query_size <= 100 * 1024 * 1024:
>         pmem = 1500
>         pmem_unit = 'mb'
>     elif query_size > 100 * 1024 * 1024 and query_size <= 500 * 1024 * 1024:
>         pmem = 2
>         pmem_unit = 'gb'
>     elif query_size > 500 * 1024 * 1024 and query_size <= 1000 * 1024 * 1024:
>         pmem = 4
>         pmem_unit = 'gb'
>     elif query_size > 1000 * 1024 * 1024 and query_size <= 2000 * 1024 * 1024:
>         pmem = 10
>         pmem_unit = 'gb'
>     elif query_size > 2000 * 1024 * 1024:
>         pmem = 20
>         pmem_unit = 'gb'
>         log.debug('OM: blastn query size is in the bigmem category: %skb\n' % (query_size))
>     else:
>         pmem = 5
>         pmem_unit = 'gb'
>     if db_multiplier > 1:
>         pmem = int(pmem * db_multiplier)
>     pmem_str = "%d%s" % (pmem, pmem_unit)
>     log.debug('OM: blastn query: %skb, db: %s, pmem: %s\n' % (query_size, db, pmem_str))
>     return {'nodes': nodes, 'ppn': ppn, 'pmem': pmem_str, 'walltime': walltime}
>
> def default_runner(tool_id, job):
>     ...
>     elif tool_id_src.startswith('ncbi_blastn_wrapper'):
>         request = ncbi_blastn(job)
>     ...
>     drmaa = 'drmaa://%s%s%s/' % (queue_str, group_str, request_str)
>     return drmaa
>
>

Hi Alex,

This is great and definitely helped me get going! I found a few
issues related to my local configuration. For instance, I'm using the
ncbi_blastn_wrapper that was migrated to the Tool Shed, so I had to
use:
elif 'ncbi_blastn_wrapper' in tool_id_src:

Instead of:
elif tool_id_src.startswith('ncbi_blastn_wrapper'):

The id for the tool from the Tool Shed is:
toolshed.g2.bx.psu.edu/repos/devteam/ncbi_blast_plus/ncbi_blastn_wrapper/0.0.13

Hopefully this won't break later on.

I also need to go back and do a better configuration of our local grid
engine (using SGE), as I did only a very bare-bones installation and
I'm running into this error:
DeniedByDrmException: code 17: unknown resource "nodes"

I realize this is a configuration issue in my SGE.

Last, and this was my mistake: I didn't initially realize that the
example you share here assumes all tools will call default_runner(),
which in turn calls a specific function to figure out the drmaa
options to set. I was trying to use the lines from your previous
email:
ncbi_blastn_wrapper = dynamic:///python/ncbi_blastn
ncbi_blastp_wrapper = dynamic:///python/ncbi_blastp

But, as a note to myself and anybody else following the thread, the
function called from universe_wsgi.ini needs to return a proper
drmaa/pbs URL.
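
A minimal sketch of what such a function could look like when a tool is mapped straight to it in universe_wsgi.ini (the function name and the Torque-style resource list are only illustrative):

def ncbi_blastn_url(job):
    # Mapped directly, e.g. "ncbi_blastn_wrapper = dynamic:///python/ncbi_blastn_url",
    # the rule must return a complete runner URL rather than a dict of resources.
    return 'drmaa://-l nodes=1:ppn=4,pmem=2gb,walltime=167:00:00/'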

Thanks again!
Carlos



Re: [galaxy-dev] Dynamic job runner status? Could I use it in combination with job splitting?

2012-11-02 Thread Oleksandr Moskalenko

On Nov 2, 2012, at 3:58 PM, Carlos Borroto  wrote:

> 
> Please, could you share your rules related to blast. I would love to
> take a look at them.
> 
> Thanks,
> Carlos

Here is the blastn rule procedure code and the relevant snippet of the default 
runner procedure. I just added the database-based multiplier, so this part is 
very simple at the moment. I just set a bogus multiplier of "4" for the "nt_*" 
databases as an example.


def ncbi_blastn(job):
    nodes = 1
    ppn = 4
    walltime = '167:00:00'
    inp_data = dict( [ ( da.name, da.dataset ) for da in job.input_datasets ] )
    inp_data.update( [ ( da.name, da.dataset ) for da in job.input_library_datasets ] )
    query_file = inp_data[ "query" ].file_name
    query_size = os.path.getsize( query_file )
    inp_params = dict( [ ( da.name, da.value ) for da in job.parameters ] )
    inp_params.update( [ ( da.name, da.value ) for da in job.parameters ] )
    db_dict = eval(inp_params['db_opts'])
    db = db_dict['database']
    db_multiplier = 1
    if db.startswith('nt'):
        db_multiplier = 4
    if query_size <= 20 * 1024 * 1024:
        pmem = 500
        pmem_unit = 'mb'
    elif query_size > 20 * 1024 * 1024 and query_size <= 50 * 1024 * 1024:
        pmem = 750
        pmem_unit = 'mb'
    elif query_size > 50 * 1024 * 1024 and query_size <= 100 * 1024 * 1024:
        pmem = 1500
        pmem_unit = 'mb'
    elif query_size > 100 * 1024 * 1024 and query_size <= 500 * 1024 * 1024:
        pmem = 2
        pmem_unit = 'gb'
    elif query_size > 500 * 1024 * 1024 and query_size <= 1000 * 1024 * 1024:
        pmem = 4
        pmem_unit = 'gb'
    elif query_size > 1000 * 1024 * 1024 and query_size <= 2000 * 1024 * 1024:
        pmem = 10
        pmem_unit = 'gb'
    elif query_size > 2000 * 1024 * 1024:
        pmem = 20
        pmem_unit = 'gb'
        log.debug('OM: blastn query size is in the bigmem category: %skb\n' % (query_size))
    else:
        pmem = 5
        pmem_unit = 'gb'
    if db_multiplier > 1:
        pmem = int(pmem * db_multiplier)
    pmem_str = "%d%s" % (pmem, pmem_unit)
    log.debug('OM: blastn query: %skb, db: %s, pmem: %s\n' % (query_size, db, pmem_str))
    return {'nodes': nodes, 'ppn': ppn, 'pmem': pmem_str, 'walltime': walltime}

def default_runner(tool_id, job):
    ...
    elif tool_id_src.startswith('ncbi_blastn_wrapper'):
        request = ncbi_blastn(job)
    ...
    drmaa = 'drmaa://%s%s%s/' % (queue_str, group_str, request_str)
    return drmaa


Regards,

Alex


Re: [galaxy-dev] Dynamic job runner status? Could I use it in combination with job splitting?

2012-11-02 Thread Oleksandr Moskalenko

On Nov 2, 2012, at 4:42 PM, Carlos Borroto  wrote:

> On Fri, Nov 2, 2012 at 3:58 PM, Carlos Borroto  
> wrote:
>> On Fri, Nov 2, 2012 at 9:53 AM, Oleksandr Moskalenko  
>> wrote:
>>> 
>>> The dynamic runner is alive and well thanks to the continuous work of its
>>> author, John Chilton, and others. For instance, yesterday John committed
>>> code to galaxy-central that propagates tracebacks from the dynamic job
>>> runner's job rules script to the logs and also allows us to raise explicit
>>> exceptions from the job rules script. That makes it possible to fail a job
>>> and print an arbitrary message in the job history's error message, and it
>>> makes the job rules much easier to debug.
>>> 
>> 
>> Great!
> 
> First bump:
> 
> galaxy.jobs.handler DEBUG 2012-11-02 16:37:00,728 Loaded job runner:
> galaxy.jobs.runners.drmaa:DRMAAJobRunner
> galaxy.jobs.handler ERROR 2012-11-02 16:29:00,415 Job runner is not
> loadable: galaxy.jobs.runners.dynamic
> Traceback (most recent call last):
>  File "/local/opt/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py",
> line 375, in _load_plugin
>module = __import__( module_name )
> ImportError: No module named dynamic
> 
> In my universe_wsgi.ini:
> 
> start_job_runners = drmaa,dynamic
> 
> I get the same error with -dist and -central. Alex, are you sure the
> dynamic job runner is already included? Should I pull something from
> somewhere else?

Hi Carlos,

The dynamic job runner is a layer on top of either the pbs or the drmaa runner, 
which actually submits the job:

start_job_runners = drmaa
default_cluster_job_runner = dynamic:///python/default_runner

where "default_runner" is a procedure you define in the job rules script.

Then, you can set the correspondence per tool, as in:

upload1 = dynamic:///python/upload1
ncbi_blastn_wrapper = dynamic:///python/ncbi_blastn
ncbi_blastp_wrapper = dynamic:///python/ncbi_blastp

and so on, or just figure out the tool_id in the default_runner procedure.
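
A bare-bones sketch of that default_runner approach (the queue name and the Torque-style resource string below are placeholders rather than my production values, and ncbi_blastn is the rule procedure shared earlier in the thread):

def default_runner(tool_id, job):
    # Normalize Tool Shed ids such as ".../ncbi_blast_plus/ncbi_blastn_wrapper/0.0.13".
    tool_id_src = tool_id.split("/")[-2] if "/" in tool_id else tool_id
    if tool_id_src.startswith('ncbi_blastn_wrapper'):
        request = ncbi_blastn(job)  # dict with nodes, ppn, pmem, walltime
        request_str = '-l nodes=%(nodes)s:ppn=%(ppn)s,pmem=%(pmem)s,walltime=%(walltime)s' % request
        return 'drmaa://%s/' % request_str
    # Everything else goes to a default queue.
    return 'drmaa://-q batch/'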

Regards,

Alex


Re: [galaxy-dev] Dynamic job runner status? Could I use it in combination with job splitting?

2012-11-02 Thread Carlos Borroto
On Fri, Nov 2, 2012 at 3:58 PM, Carlos Borroto  wrote:
> On Fri, Nov 2, 2012 at 9:53 AM, Oleksandr Moskalenko  wrote:
>>
>> The dynamic runner is alive and well thanks to the continuous work of its
>> author, John Chilton, and others. For instance, yesterday John committed
>> code to galaxy-central that propagates tracebacks from the dynamic job
>> runner's job rules script to the logs and also allows us to raise explicit
>> exceptions from the job rules script. That makes it possible to fail a job
>> and print an arbitrary message in the job history's error message, and it
>> makes the job rules much easier to debug.
>>
>
> Great!

First bump:

galaxy.jobs.handler DEBUG 2012-11-02 16:37:00,728 Loaded job runner:
galaxy.jobs.runners.drmaa:DRMAAJobRunner
galaxy.jobs.handler ERROR 2012-11-02 16:29:00,415 Job runner is not
loadable: galaxy.jobs.runners.dynamic
Traceback (most recent call last):
  File "/local/opt/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py",
line 375, in _load_plugin
module = __import__( module_name )
ImportError: No module named dynamic

In my universe_wsgi.ini:

start_job_runners = drmaa,dynamic

I get the same error with -dist and -central. Alex, are you sure the
dynamic job runner is already included? Should I pull something from
somewhere else?

Thanks,
Carlos



Re: [galaxy-dev] Dynamic job runner status? Could I use it in combination with job splitting?

2012-11-02 Thread Carlos Borroto
Hi Alex,

On Fri, Nov 2, 2012 at 9:53 AM, Oleksandr Moskalenko  wrote:
>
> The dynamic runner is alive and well thanks to the continuous work of its
> author, John Chilton, and others. For instance, yesterday John committed
> code to galaxy-central that propagates tracebacks from the dynamic job
> runner's job rules script to the logs and also allows us to raise explicit
> exceptions from the job rules script. That makes it possible to fail a job
> and print an arbitrary message in the job history's error message, and it
> makes the job rules much easier to debug.
>

Great!

> John's information is spot on. I can also share my rules script with you.
> I've completely switched our production instance to the dynamic runner. It
> allows me to set the resource requests based on datasets (the blast query at
> the moment, but I'm getting to the point of using the database as well) and
> on the resources allocated to the user's group or a secondary group, as well
> as to check internal ACLs for tool usage restrictions and to send jobs to
> specific queues or compute nodes as necessary.
>
> I think your goals will inevitably bring you to the dynamic runner and it's 
> not a bad thing.
>

Please, could you share your rules related to blast. I would love to
take a look at them.

Thanks,
Carlos



Re: [galaxy-dev] Dynamic job runner status? Could I use it in combination with job splitting?

2012-11-02 Thread Oleksandr Moskalenko

On Nov 1, 2012, at 4:22 PM, Carlos Borroto  wrote:

> Hi,
> 
> I've been researching the possibility of using dynamic job runner in
> combination with job splitting for blast jobs. My main interest is to
> create a rule where both the size of the query and the database are
> taken into consideration in order to select DRMAA and splitting
> options.
> 
> My first question is what is the status on dynamic job runner? I found
> these two threads, but is not clear to me if this feature is part of
> galaxy-dist already:
> http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-October/007160.html
> http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-June/010080.html

The dynamic runner is alive and well thanks to the continuous work of its 
author, John Chilton, and others. For instance, yesterday John committed code 
to galaxy-central that propagates tracebacks from the dynamic job runner's job 
rules script to the logs and also allows us to raise explicit exceptions from 
the job rules script. That makes it possible to fail a job and print an 
arbitrary message in the job history's error message, and it makes the job 
rules much easier to debug.
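
As a rough sketch only (the exact exception class comes from John's changeset and isn't named here, so a plain Exception stands in for it, and the rule below is illustrative rather than my production code):

import os

def ncbi_blastn_guarded(job):
    # Fail the job with a readable message instead of letting it die on the cluster.
    inp_data = dict((da.name, da.dataset) for da in job.input_datasets)
    query_file = inp_data['query'].file_name
    if os.path.getsize(query_file) == 0:
        raise Exception('blastn was given an empty query dataset; please check your input.')
    return {'nodes': 1, 'ppn': 4, 'pmem': '2gb', 'walltime': '167:00:00'}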

> My second question is if there is any documentation other than the
> threads above to configure something like what I describe? In any
> case, there is very good information from John in these emails and I
> think that should get me started at least.

John's information is spot on. I can also share my rules script with you. I've 
completely switched our production instance to the dynamic runner. It allows me 
to set the resource requests based on datasets (the blast query at the moment, 
but I'm getting to the point of using the database as well) and on the 
resources allocated to the user's group or a secondary group, as well as to 
check internal ACLs for tool usage restrictions and to send jobs to specific 
queues or compute nodes as necessary.

I think your goals will inevitably bring you to the dynamic runner and it's not 
a bad thing. 

The only large obstacle in the job handling area, as I perceive it, is the lack 
of a generic way to plug some statements into any tool's interface without 
changing all tool definition files, but that's beyond the dynamic runner and is 
something for the Galaxy Core Team to consider.

Cheers,

Alex