On Mar 15, 2011, at 9:27 AM, Nate Coraor wrote:

> Ry4an Brase wrote:
>> As use of our Galaxy installation is picking up, we're getting a lot of
>> requests for greater fairness and transparency in the Galaxy job runner
>> area.
>> 
>> As I understand things the primary tool Galaxy gives us to affect
>> processing order and wait times with our torque-based setup is the
>> ability to map specific tools to varying queues or to keep them on a
>> local-runner.
>> 
>> On one end of the spectrum I could see a simple division of
>> small/fast/light jobs on local and big/heavy/slow job on a single
>> cluster queue.  On the other extreme one could set up a queue per tool
>> and use sophisticated queue management stuff on the torque side of
>> things to balance capacity across tools, users, expected processing
>> time, etc.
>> 
>> How are other sites handling this?
> 
> Hi Ry4an,
> 
> I'd prefer to keep most of the scheduling in the DRM (Torque, SGE, etc.)
> since that's what it's designed to do.  That said, we want to make it as
> easy as possible to do this, and Galaxy currently only sort of has the
> ability to do it.  By currently I mean that you can set DRM parameters
> per-tool in the config file.
> 
> There are a couple of pieces that need to exist.  For environments like
> our public site where Galaxy users can't map one-to-one with system
> users, Galaxy itself needs to be able to limit the number of jobs a user
> can run on a particular cluster.  Work on this component is under way.
> 
> On environments where Galaxy users *are* system users, Galaxy needs to
> do things that interact with the system, such as reading files from disk
> for upload, exporting files for download, and submittings cluster jobs
> as the real user.  Writing this is near-ish to the top of my list.
> 
> There's a final piece which we've discussed here quite a few times but
> are not very close to implemeting.  That would be a config language to
> allow Galaxy to make decisions about DRM parameters to set based on
> variables like input size or sequence count, parameters selected, and so
> forth.  A good example of where this is needed is in the mappers, which
> currently have a hardcoded multiprocessor setting of 4 that is almost
> certainly not appropriate for all environments.  Ideally Galaxy would be
> able to decide where to run the job and based on that information, know
> how many threads/processes to start based on the resources the job is
> given.  I'd love to see this also be able to make assumptions about
> runtime so that DRM backfill could be properly employed, but this may
> not be possible since most job runtimes are probably not a calculable
> function of the size of the input data and the selected parameters.
> 
> --nate


are there issues open for these Galaxy changes?  I would like to follow the 
development.


--
Glen L. Beane
Senior Software Engineer
The Jackson Laboratory
(207) 288-6153





___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Reply via email to