Martin,

We are in a similar situation. What we have done, with mixed success on some clusters, is to explain to the groups, and get their approval, that it is CPU time that gets allocated rather than the hardware itself. If you can get that agreement, then fairshare with the groups given different priorities seems to work just fine.
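For reference, the fairshare side of that in maui.cfg looks something like the sketch below. The group names, targets and priorities are purely illustrative, not values from a real cluster:

    # maui.cfg -- fairshare on dedicated processor-seconds (example values only)
    FSPOLICY        DEDICATEDPS
    FSDEPTH         7             # keep seven fairshare windows
    FSINTERVAL      24:00:00      # each window covers one day
    FSDECAY         0.80          # older usage counts for less
    FSWEIGHT        1             # weight of fairshare in job priority

    # per-group usage targets and priorities (illustrative)
    GROUPCFG[physics]   FSTARGET=40 PRIORITY=1000
    GROUPCFG[chem]      FSTARGET=40 PRIORITY=1000
    GROUPCFG[science]   FSTARGET=20 PRIORITY=100

The idea is that groups who paid for more hardware get a larger FSTARGET and a higher base priority, and the scheduler steers each group's share of delivered CPU time toward its target.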
In cases where groups need access to their "own" nodes, I have implemented this by assigning properties to the nodes in the TORQUE nodes file and then using separate queues with the neednodes attribute set. For example:

    node001 np=8 all chem
    node002 np=8 all chem
    node003 np=8 all physics
    node004 np=8 all physics
    node005 np=8 physics chem
    etc.

-- noting that node005, which lacks the "all" property, is automatically reserved for chem and physics exclusively.

Then, for the short queue (in qmgr, where "s q" abbreviates "set queue"):

    s q short resources_default.neednodes = all

and similarly:

    s q physics resources_default.neednodes = physics

Users then need to be disciplined into specifying a particular queue with their jobs; otherwise they land in short, the default. You could still set up short, medium and long, but constrain their limits so that resources remain available for those who need them. Constant tinkering, but it is an approach I use.
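Spelled out in full, one of those restricted queues looks roughly like this in qmgr. The queue name, walltime limit and group are placeholders rather than our actual config, and the acl_groups lines are an optional extra to stop users outside physics from submitting there at all:

    # sketch: an execution queue tied to physics nodes (not verbatim from a real setup)
    qmgr -c "create queue physics queue_type=execution"
    qmgr -c "set queue physics resources_default.neednodes = physics"
    qmgr -c "set queue physics resources_max.walltime = 200:00:00"
    # optionally restrict submission to the physics unix group
    qmgr -c "set queue physics acl_group_enable = true"
    qmgr -c "set queue physics acl_groups = physics"
    qmgr -c "set queue physics enabled = true"
    qmgr -c "set queue physics started = true"

A physics user would then submit with "qsub -q physics job.sh" to be routed onto physics nodes.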
Does this make any sense?

Bill

On 2/3/2011 8:33 PM, Martin Thompson wrote:
> Hi there
>
> I am trying (and failing) to implement a scheduling policy and I'd
> appreciate your advice on how it could be done. Below I have tried to
> construct the simplest example that includes the key components of the
> policy...
>
> The compute nodes belong to three groups: science, physics and
> chemistry. Everyone in physics and chemistry also belongs to science,
> but there are also science users who do not belong to physics or
> chemistry, e.g. biology. The background story is that the faculty of
> science bought the original cluster, and physics and chemistry have
> extended it with their own funds.
>
> So that we can apply constraints based on job run-time we usually have
> three queues: short, medium and long.
>
> Short jobs can run on any compute node regardless of the user's group.
> Medium and long jobs can only run on compute nodes affiliated with the
> user. For example, a chemist can run short jobs anywhere, and medium
> and long jobs on science or chemistry compute nodes.
>
> We'd also like jobs to prefer compute nodes belonging to the user's own
> group. For example, a short chemistry job should consider chemistry,
> science and physics compute nodes in that order.
>
> And finally, we're interested in relaxing some constraints when a user
> runs a job on their own compute nodes. For example, perhaps the maximum
> run-time in the 'long' queue is 100 hours, but physics would like to run
> jobs with run-times of 200 hours on their own compute nodes. Perhaps
> this would involve an additional queue for each group that requires very
> long jobs. This facility would not be offered to the science group.
>
> I have tried to implement this policy with a combination of queues
> defined in Torque that classify jobs based on run-time, and standing
> reservations in Maui that control which groups have access to the
> various compute nodes. However, I am struggling to map the rules I have
> just described to the access control mechanisms available in standing
> reservations. I'd like to do something like...
>
> SRCFG[physics1] CLASSLIST=short,medium,long,physicsverylong
> SRCFG[physics1] GROUPLIST=physics+
>
> SRCFG[physics2] CLASSLIST=short-
> SRCFG[physics2] GROUPLIST=!physics
>
> ...where CLASSLIST and GROUPLIST must both be satisfied within each
> standing reservation, and either of the two standing reservations must
> be satisfied. However, if I understand correctly, it is only possible
> to do the exact opposite: CLASSLIST || GROUPLIST and SR && SR.
>
> So I suspect that I need a completely different approach.
>
> I'd be very grateful if anyone can offer advice on any part of this
> puzzle.
>
> Many thanks
>
> Martin

_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
