So we are just starting to explore node sharing with Slurm.  A question has
come up as we are looking at using CR_Core_Memory for SelectTypeParameters.
The key point is that this moves us into scheduling for memory; before, we
were handing out whole nodes and users simply got everything a node had to
offer.  Now that we need to start scheduling for RAM, I'm left wondering a few
things based on the following conditions:

- Not all of our nodes have the same core count or the same RAM-per-core
(TotalRAM/Cores) capacity.
- This is true both across the whole cluster and, at times, even within a
single partition.
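
For reference, the specific change I'm talking about above amounts to this in
our slurm.conf (assuming select/cons_res as the select plugin):

  SelectType=select/cons_res
  SelectTypeParameters=CR_Core_Memory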

So my initial feeling is to choose a default memory size for each partition
that represents the lowest common RAM-per-core value.  That seems fine as a
default when a user does not specify anything, and it also seems fine when a
user does specify a particular memory target.  What I'm trying to explore/ask
here, though, is this: what if a user really just wants to submit a job and
get a "whole node"?  Is there a way to articulate that so that Slurm
understands that, whatever node is allocated, it should simply give that job
all of the RAM the node has and have the cgroup support set things up
accordingly?
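
To make that concrete, I'm picturing a per-partition default along these lines
in slurm.conf (partition and node names are just placeholders); as I
understand it DefMemPerCPU can be set on the partition line, with the value
matching the lowest RAM-per-core of any node type in that partition:

  # 2048 MB/core = lowest RAM/core of any node type in this partition
  PartitionName=general Nodes=node[01-40] Default=YES DefMemPerCPU=2048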



Let me give an example of why I'm wondering this.  Say we have a cluster with
three node types in it.  First are 12-core nodes with 24GB of RAM.  Next are
20-core nodes with 64GB of RAM.  Last are 24-core nodes with 256GB of RAM.
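
In slurm.conf terms that mix would look roughly like this (node names and
counts are made up, and in practice RealMemory would be whatever 'slurmd -C'
reports on each type rather than the raw totals):

  NodeName=n12-[01-16] CPUs=12 RealMemory=24576   # 12 cores, 24GB  -> 2GB/core
  NodeName=n20-[01-16] CPUs=20 RealMemory=65536   # 20 cores, 64GB  -> 3.2GB/core
  NodeName=n24-[01-16] CPUs=24 RealMemory=262144  # 24 cores, 256GB -> ~10.7GB/core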

The group that funded this cluster often runs some jobs on specific node
types.  When they do this I don't really see any complication in having memory
scheduled/allocated, since they will already be specifying which node type
they want.  The usage model I'm struggling with, though, is that at times they
want to run on N cores.  They do not want to be specific; they just want to
submit based on a core count.  With the above mix of node types, the question
becomes how they should then specify the memory requirement.  Clearly they can
use --mem-per-cpu, but for it to work across the whole mix they would have to
use the lowest common denominator, which in the above setup would be 2GB.  So
what I'm fishing for here is this: is there a way to say either --mem=ALL or
--mem="ALL/NodeCores"?
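
Just to illustrate the --mem-per-cpu fallback above, that lowest common
denominator submission would end up looking something like this (core count
and script name are just examples):

  # 16 cores anywhere in the mix, sized to the smallest RAM/core value;
  # on the 64GB and 256GB nodes this leaves most of the memory unrequested.
  sbatch -n 16 --mem-per-cpu=2048 job.sh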

The first of those would simply ask for all available RAM on whatever node is
assigned, at a whole-node level, while the latter would simply factor in the
crude/simplistic value of whatever RAM a node has divided by the number of
cores that same node has.
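
Spelled out against the node mix above, the hypothetical --mem="ALL/NodeCores"
would resolve to:

  12 cores / 24GB   ->  24/12  = 2GB per core
  20 cores / 64GB   ->  64/20  = 3.2GB per core
  24 cores / 256GB  -> 256/24  ~= 10.7GB per core

versus the flat 2GB per core they would be stuck with via --mem-per-cpu today.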

What I'm trying to avoid is making things overly complex for the user when
the job they are running has less well-defined requirements.

Happy to take any feedback on this, as I imagine some of my line of thought
here is simply a reaction to looking at node sharing for the first time, and
maybe I just need to grow up a bit and deal with having to train users to be
more specific and accurate. ;)  Thanks!


--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C
