On Jan 6, 2011, at 12:35 PM, James Taylor wrote:

> Two parts to this probably.
> 
> 1) It should definitely be possible to have parameters in tool configs  
> that are set in a global configuration file, I actually thought  
> tool_conf.xml might be a good place (inside the tool element).
> 
> 2) For the particular case of processor cores, ideally we would be  
> able to have the batch management system set this information (if  
> running on an 8 core node, use 8).

I'm not sure of a cross-platform way to figure this out, and I admit this is 
kind of a hack, but when running in a TORQUE batch environment a tool wrapper 
could parse the file specified in $PBS_NODEFILE at run time to figure out how 
many threads it should use (i.e., how many cores have been allocated to the 
job).  This would require a wrapper script that parses the file and then passes 
along the right flag on the command line to instruct the executable to use the 
right number of cores (not necessarily how many are on the node, but how many 
the batch system allocated to the job).  MPI jobs are easier if the MPI 
implementation integrates with the batch system, since by default they will run 
on all the nodes/cores allocated to the job.  It would be possible to write a 
script that knew how to do this for a variety of batch systems and could 
auto-detect which environment it is running in.  This could then be used to 
plug in the value for a command line switch specifying the number of threads 
for the actual executable.
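For illustration, a minimal Python sketch of the TORQUE case (the function name 
and fallback behavior are my own, not anything Galaxy provides).  TORQUE writes 
one line per allocated core to $PBS_NODEFILE, with node names repeated ppn 
times, so the line count is the job's core total:

```python
import os

def allocated_cores(default=1):
    """Count cores allocated to this job by parsing $PBS_NODEFILE.

    TORQUE lists one line per allocated core (node names repeated
    ppn times), so the number of non-empty lines is the core total.
    Falls back to `default` when not running under TORQUE.
    """
    nodefile = os.environ.get("PBS_NODEFILE")
    if not nodefile or not os.path.exists(nodefile):
        return default
    with open(nodefile) as fh:
        return sum(1 for line in fh if line.strip())
```

A wrapper could then pass the returned value as, e.g., bowtie's -p flag.  A 
multi-batch-system version would just try the detection for each environment 
(PBS, SGE, LSF, ...) in turn before falling back to the default.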

But there is still the problem of requesting the proper number of nodes and 
cores per node from the batch system.  For our local setup we have a default 
PBS job runner that submits to the default server and default queue and 
requests a single core for some upper limit of time:

 pbs:////-l nodes=1:ppn=1,walltime=HH:MM:SS

and then for every threaded tool we specify a tool-specific job runner, like 
this:

bowtie_wrapper = pbs:////-l nodes=1:ppn=N,walltime=HH:MM:SS/

where in our case N <= 32 (32 cores per node)

It would be nice to have a parameterized job runner where the tool itself had 
some control over the number of nodes and ppn it requested, but with 
system-specified bounds on the values.

A runner specified like pbs:////-l 
nodes=${NODES}:ppn=${CORES_PER_NODE},walltime=HH:MM:SS/ where Galaxy knew how 
to fill in NODES and CORES_PER_NODE based on information in the tool 
configuration and system-specified limits would do the trick.  Then we wouldn't 
need to define a new job runner for every tool that we don't want using our 
default single node/core runner.
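A rough sketch of how that substitution might work (SYSTEM_LIMITS, fill_runner, 
and the clamping behavior are all hypothetical, just to make the idea 
concrete):

```python
import string

# Assumed local caps; in practice these would come from a Galaxy
# config file rather than being hard-coded.
SYSTEM_LIMITS = {"NODES": 1, "CORES_PER_NODE": 32}

def fill_runner(template, requested):
    """Substitute ${NODES}/${CORES_PER_NODE} in a runner URL,
    clamping the tool's requested values to the system limits."""
    values = {key: str(min(int(requested.get(key, 1)), cap))
              for key, cap in SYSTEM_LIMITS.items()}
    return string.Template(template).substitute(values)

runner = "pbs:////-l nodes=${NODES}:ppn=${CORES_PER_NODE},walltime=HH:MM:SS/"
# A tool asking for 64 cores per node gets clamped to the 32-core cap.
fill_runner(runner, {"NODES": 1, "CORES_PER_NODE": 64})
```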

> 
> On Jan 6, 2011, at 5:10 AM, Peter wrote:
> 
>> Some of the provided Galaxy wrappers have a note in the XML
>> saying the number of threads should be configurable, perhaps
>> via a loc file. I have suggested to the Galaxy developers there
>> should be a general setting for number of threads per tool
>> accessible via the XML, so that this can be configured centrally
>> (maybe I should file an enhancement issue for this):
> 
> _______________________________________________
> galaxy-dev mailing list
> galaxy-dev@lists.bx.psu.edu
> http://lists.bx.psu.edu/listinfo/galaxy-dev

--
Glen L. Beane
Software Engineer
The Jackson Laboratory
Phone (207) 288-6153
