On Sun, Jun 10, 2012 at 7:33 AM, John Chilton <chil...@msi.umn.edu> wrote:
> My previous e-mail was a technical description of how it worked, I
> think maybe that is why it didn't generate the excitement I had hoped
> :). I think instead describing some concrete use cases might be
> better. So here are six cool things you can do with dynamic job
> runners.
>
> ...
>
> 1) Change maximum walltime based on job parameters or file sizes.
>
> Let's say you want to change the max walltime of BlastN based on
> the size of the input query. First you would add the line
> ncbi_blastn_wrapper=dynamic:///python to universe_wsgi.ini. Next in
> 200_runners.py you would add a function such as the following:
>
> import os
>
> def ncbi_blastn_wrapper(job):
>  inp_data = dict( [ ( da.name, da.dataset ) for da in job.input_datasets ] )
>  inp_data.update( [ ( da.name, da.dataset ) for da in job.input_library_datasets ] )
>  query_file = inp_data[ "query" ].file_name
>  query_size = os.path.getsize( query_file )
>  if query_size > 1024 * 1024:
>    return 'pbs:////-l walltime=24:00:00/'
>  else:
>    return 'pbs:////-l walltime=12:00:00/'

So these wall time estimates live in a separate file from the tool
wrapper - that seems a good idea, as they will depend on the
power of the local cluster's nodes. And they can be elaborated
on as needed (e.g. for BLAST, consider both the number of query
sequences and the number of subject sequences - i.e. the
database size).
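
For example, something along these lines (untested, and partly
guesswork: "query" is from your snippet, but the "subject" input
name for a history database is an assumption - check the tool XML
for the real names, and the thresholds are made up):

import os

def ncbi_blastn_wrapper(job):
    inp_data = dict( [ ( da.name, da.dataset ) for da in job.input_datasets ] )
    inp_data.update( [ ( da.name, da.dataset ) for da in job.input_library_datasets ] )
    # Count the FASTA records in the query rather than using raw file size:
    query_count = 0
    for line in open( inp_data[ "query" ].file_name ):
        if line.startswith( ">" ):
            query_count += 1
    # Factor in the database size too (assumes a history dataset "subject"):
    subject_size = os.path.getsize( inp_data[ "subject" ].file_name )
    if query_count > 1000 or subject_size > 1024 * 1024 * 1024:
        return 'pbs:////-l walltime=24:00:00/'
    else:
        return 'pbs:////-l walltime=12:00:00/'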

Presumably the exact same approach could handle this:

(7) Change job priority or queue depending on job details.

For potentially memory-intensive tasks like assembly, jobs could
be allocated to a big memory queue if the input read count
is large, or to the normal (lower memory) queue
for smaller jobs like bacteria or viruses.
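
For instance, reusing the structure of your BlastN example
(untested; the function/tool name, the "input" parameter name,
the "bigmem" queue and the size threshold are all placeholders
for whatever the local setup uses, and I am guessing the queue
name goes in the runner URL's queue slot):

import os

def velvet_wrapper(job):
    inp_data = dict( [ ( da.name, da.dataset ) for da in job.input_datasets ] )
    reads_file = inp_data[ "input" ].file_name
    # Raw size of the reads file as a cheap proxy for the read count:
    if os.path.getsize( reads_file ) > 10 * 1024 ** 3:
        return 'pbs:///bigmem//'  # hypothetical large memory queue
    else:
        return 'pbs:////'  # default queue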

Or, in a slight variation on your wall time snippet, the code
could (also) send big jobs to the low priority queue
and small jobs to the high priority queue - using either
named queues or priority settings, depending on the
cluster setup. That was something I was hoping to do:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-June/009962.html
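
Something like this for the priority version (untested; it
assumes the runner passes a qsub priority flag like -p through
in the same way as the -l options - if not, the named queue
approach above would achieve the same thing):

import os

def ncbi_blastn_wrapper(job):
    inp_data = dict( [ ( da.name, da.dataset ) for da in job.input_datasets ] )
    query_size = os.path.getsize( inp_data[ "query" ].file_name )
    if query_size > 1024 * 1024:
        return 'pbs:////-p -500/'  # big job, low priority
    else:
        return 'pbs:////-p 500/'  # small job, high priority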

With a heterogeneous cluster setup this sort of thing
would be very helpful.

If Nate is positive about including your work soon, that
is very good news :)

Peter
