Martin,

We are in a similar situation.  What we have done, with mixed success,
on some clusters is to explain, and get approval for, the principle that
it is CPU time that gets allocated rather than the hardware itself.  If
you can get this agreement, then fairshare, with groups carrying
different priorities, seems to work just fine.
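
For example, a minimal maui.cfg sketch of what I mean -- the group
names, targets, and priorities below are purely illustrative:

# maui.cfg -- allocate CPU time via fairshare rather than hardware
FSPOLICY   DEDICATEDPS   # charge dedicated processor-seconds
FSDEPTH    7             # keep seven fairshare windows
FSINTERVAL 24:00:00      # each window covers one day
FSDECAY    0.80          # older usage counts progressively less
FSWEIGHT   100           # weight of fairshare in job priority

# per-group usage targets and base priorities
GROUPCFG[physics] FSTARGET=40 PRIORITY=100
GROUPCFG[chem]    FSTARGET=40 PRIORITY=100
GROUPCFG[science] FSTARGET=20 PRIORITY=50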

In cases where groups need access to their "own" nodes, I have
implemented this by assigning properties to nodes in the server's nodes
file and then pointing separate queues at those properties via the
neednodes attribute.

node001 np=8 all chem
node002 np=8 all chem
node003 np=8 all physics
node004 np=8 all physics
node005 np=8 physics chem
etc. -- noting that this automatically reserves some nodes exclusively
for chem and physics: node005 lacks the "all" property, so only the chem
and physics queues can reach it.

Then, in qmgr, for queue short:

set queue short resources_default.neednodes = all

Similarly, for queue physics:

set queue physics resources_default.neednodes = physics
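
For completeness, a sketch of how such a group queue might be created
from scratch in qmgr -- the 200-hour walltime and the group ACL are
illustrative, borrowed from the physics example in your message:

create queue physics
set queue physics queue_type = Execution
set queue physics resources_default.neednodes = physics
set queue physics resources_max.walltime = 200:00:00
set queue physics acl_group_enable = True
set queue physics acl_groups = physics
set queue physics enabled = True
set queue physics started = True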

Users then need to be disciplined into specifying a queue with their
jobs; otherwise they land in short, the default.  You could still set
up short, medium, and long, but constrain how many jobs each may run
at once so that resources stay available for those who need them.  It
takes constant tinkering, but it is an approach I use.
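
Concretely, something like this -- the limits are placeholders, and I
am assuming short is the server default:

set server default_queue = short
set queue short resources_max.walltime = 1:00:00
set queue short max_running = 64
set queue medium resources_max.walltime = 24:00:00
set queue medium max_running = 32
set queue long resources_max.walltime = 100:00:00
set queue long max_running = 16

Users reach the group-specific queues explicitly, e.g. qsub -q physics
job.sh; anything submitted without -q lands in short.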

Does this make any sense?

Bill

On 2/3/2011 8:33 PM, Martin Thompson wrote:
> Hi there
>
> I am trying (and failing) to implement a scheduling policy and I'd
> appreciate your advice on how it could be done.  Below I have tried to
> construct the simplest example that includes the key components of the
> policy...
>
> The compute nodes belong to three groups: science, physics and
> chemistry.  Everyone in physics and chemistry also belongs to science,
> but there are also science users who do not belong to physics or
> chemistry, e.g. biology.  The background story is that the faculty of
> science bought the original cluster, and physics and chemistry have
> extended it with their own funds.
>
> So that we can apply constraints based on job run-time we usually have
> three queues: short, medium and long.
>
> Short jobs can run on any compute node regardless of the user's group.
> Medium and long jobs can only run on compute nodes affiliated with the
> user.  For example, a chemist can run short jobs anywhere, and medium
> and long jobs on science or chemistry compute nodes.
>
> We'd also like jobs to prefer compute nodes belonging to the user's own
> group.  For example, a short chemistry job should consider chemistry,
> science and physics compute nodes in that order.
>
> And finally, we're interested in relaxing some constraints when a user
> runs a job on their own compute nodes.  For example, perhaps the maximum
> run-time in the 'long' queue is 100 hours, but physics would like to run
> jobs with run-times of 200 hours on their own compute nodes.  Perhaps
> this would involve an additional queue for each group that requires very
> long jobs.  This facility would not be offered to the science group.
>
> I have tried to implement this policy with a combination of queues
> defined in Torque that classify jobs based on run-time, and standing
> reservations in Maui that control which groups have access to the
> various compute nodes.  However, I am struggling to map the rules I have
> just described to the access control mechanisms available in standing
> reservations.  I'd like to do something like...
>
> SRCFG[physics1] CLASSLIST=short,medium,long,physicsverylong
> SRCFG[physics1] GROUPLIST=physics+
>
> SRCFG[physics2] CLASSLIST=short-
> SRCFG[physics2] GROUPLIST=!physics
>
> ...where CLASSLIST and GROUPLIST must both be satisfied within each
> standing reservation, and either of the two standing reservations must
> be satisfied.  However, if I understand correctly, it is only possible
> to do the exact opposite: CLASSLIST || GROUPLIST and SR && SR.
>
> So I suspect that I need a completely different approach.
>
> I'd be very grateful if anyone can offer advice on any part of this
> puzzle.
>
> Many thanks
>
> Martin
>

_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
