Hi Louise-Amélie,

Thank you for the post reference, it is exactly what I was looking for.
For us, for example, when I want to execute a tool that is a Java command,
the JVM will typically use multiple cores as it runs.  You said that with
TORQUE the job will crash when there aren't enough resources available at
the time it is submitted.  I wonder if you can do the same thing we have
done here with LSF?  With LSF you can configure a maximum load for each
node, and if the submitted jobs push the node load above this threshold
(e.g. more cores requested than available), LSF will temporarily suspend
jobs (using some kind of heuristic) so that the load stays below the
threshold, and resume them as resources become available.  So for us,
things just run more slowly when we cannot pass the requested number of
cores to LSF.
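
For example, roughly (this is only an illustrative sketch from memory, not
our exact configuration, so the tool name and values are made up):

# ask LSF for 4 cores on a single host for a multithreaded Java tool
bsub -n 4 -R "span[hosts=1]" java -jar mytool.jar

# per-queue load thresholds in lsb.queues: schedule new jobs below 0.7,
# suspend running jobs if the 1-minute load goes above 2.0
Begin Queue
QUEUE_NAME = normal
r1m = 0.7/2.0
End Queue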

I would think there might be a way with TORQUE to achieve the same thing,
so that jobs don't crash when the requested resources exceed what is
available?

regards,
Leandro

2011/5/19 Louise-Amélie Schmitt <louise-amelie.schm...@embl.de>

>  Hi,
>
> In a previous message (quoted below) I explained how I multithreaded
> certain jobs; perhaps you can modify the corresponding file for drmaa in a
> similar way.
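>
> Something along these lines might work in lib/galaxy/jobs/runners/drmaa.py
> (an untested sketch: I am assuming the runner builds a drmaa job template,
> called jt here, and has the job_wrapper at hand, so the exact variable
> names and the right place to put it may differ, and whether the native
> specification string is accepted depends on your DRMAA library):
>
> # read the same tool-data/multithreading.csv and pass the core/memory
> # request through the DRMAA native specification
> mt_file = open( 'tool-data/multithreading.csv', 'r' )
> for line in mt_file:
>     fields = line.split()
>     if len( fields ) > 2 and fields[0] == job_wrapper.tool.id:
>         # TORQUE/PBS syntax shown; adapt it to your DRM
>         jt.nativeSpecification = '-l nodes=1:ppn=%s -l mem=%s' % ( fields[1], fields[2] )
>         break
> mt_file.close()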
>
> On 04/26/2011 11:26 AM, Louise-Amélie Schmitt wrote:
>
> Just one little fix on line 261:
> 261                 if ( len(l) > 1 and l[0] == job_wrapper.tool.id ):
>
> Otherwise it pathetically crashes when non-multithreaded jobs are
> submitted. Sorry about that.
>
> Regards,
> L-A
>
> On Tuesday 19 April 2011 at 14:33 +0200, Louise-Amélie Schmitt wrote:
>
>  Hello everyone,
>
> I'm using TORQUE with Galaxy, and we noticed that if a tool is
> multithreaded, the number of cores it needs is not communicated to PBS,
> leading to job crashes if the required resources are not available when
> the job is submitted.
>
> Therefore I slightly modified the code in
> lib/galaxy/jobs/runners/pbs.py as follows:
>
> 256         # define PBS job options
> 257         attrs.append( dict( name = pbs.ATTR_N, value = str( "%s_%s_%s" % ( job_wrapper.job_id, job_wrapper.tool.id, job_wrapper.user ) ) ) )
> 258         mt_file = open('tool-data/multithreading.csv', 'r')
> 259         for l in mt_file:
> 260                 l = string.split(l)
> 261                 if ( l[0] == job_wrapper.tool.id ):
> 262                         attrs.append( dict( name = pbs.ATTR_l, resource = 'nodes', value = '1:ppn='+str(l[1]) ) )
> 263                         attrs.append( dict( name = pbs.ATTR_l, resource = 'mem', value = str(l[2]) ) )
> 264                         break
> 265         mt_file.close()
> 266         job_attrs = pbs.new_attropl( len( attrs ) + len( pbs_options ) )
>
> The csv file contains a list of the multithreaded tools, each line
> containing:
> <tool id>\t<number of threads>\t<memory needed>\n
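>
> For example, a line for a hypothetical tool could look like this
> (tab-separated; the tool id and values are purely illustrative):
>
> bwa_wrapper	4	8gb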
>
> And it works fine: the jobs properly wait for their turn, but the
> information is duplicated. Perhaps there would be a way to include
> something similar in Galaxy's original code (if it is not already the
> case, I may not be up to date) without duplicating the data.
>
> I hope that helps :)
>
> Best regards,
> L-A
>
> On 05/19/2011 12:03 PM, Leandro Hermida wrote:
>
> Hi,
>
> When Galaxy is configured to use the DRMAA job runner, is there a way for a
> tool to tell DRMAA the number of cores it would like to request? The
> equivalent of bsub -n X in LSF, where X is the minimum number of cores to
> have available on the node.
>
> best,
> leandro
>