We are just starting to explore node sharing with Slurm, and a question has come up as we look at using CR_Core_Memory for SelectTypeParameters. The key point is that this moves us into scheduling for memory; until now we have been handing out whole nodes, and users simply got everything a node had to offer.
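For reference, the slurm.conf change we have in mind is roughly the following. This is only a sketch of how I understand the consumable-resource setup to work; the node and partition names and the DefMemPerCPU value are made-up placeholders, and I believe DefMemPerCPU can be set per partition as well as cluster-wide:

    # Treat cores and memory as consumable resources (sketch, not our live config)
    SelectType=select/cons_res
    SelectTypeParameters=CR_Core_Memory

    # Enforce the memory allocation with cgroups
    TaskPlugin=task/cgroup

    # Default memory per allocated CPU (MB), chosen as the lowest RAM/core
    # of the nodes in the partition (placeholder values)
    PartitionName=general Nodes=node[01-32] DefMemPerCPU=2048 State=UP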
Now that we need to start scheduling for RAM, I'm left wondering a few things, given the following conditions:

- Not all of our nodes have the same core count or the same RAM per core (TotalRAM/Cores).
- This is true across the whole cluster and, at times, even within a single partition.

My initial feeling is to choose a default memory setting for each partition that represents the lowest common value of RAM per core. That seems fine as a default when a user does not specify anything, and it also works when a user does specify an explicit memory target. What I'm trying to explore here, though, is what happens when a user really just wants to submit a job and get a "whole node". Is there a way to articulate that so that Slurm understands that, whatever node is allocated, it should simply allocate all of that node's available RAM to the job and have the cgroup support set things up accordingly?

Let me give an example of why I'm wondering this. Say we have a cluster with three node types: 12-core nodes with 24GB of RAM, 20-core nodes with 64GB of RAM, and 24-core nodes with 256GB of RAM. The group that funded this cluster often runs some jobs on specific node types. When they do that, I don't see any complication with having memory scheduled/allocated, since they are already specifying which node type they want.

The usage model I'm struggling with is that at times they just want to run on N cores. They don't want to be specific; they just want to submit based on a core count. With the mixed node types above, the question becomes how they should specify the memory requirement. They can certainly use --mem-per-cpu, but to keep that portable across all node types they would have to use the lowest common denominator, which in the setup above is 2GB per core.

So what I'm fishing for is whether there is a way to say either --mem=ALL or --mem="ALL/NodeCores". The first would ask for all available RAM on whatever node is assigned, at the whole-node level, and the latter would use the crude/simplistic value of whatever RAM a node has divided by that same node's core count. (I've put a quick illustration of what I mean after my signature.) What I'm trying to avoid is making things overly complex for users when the jobs they run have less well-defined requirements.

Happy to take any feedback on this, as I imagine some of my thinking here is simply a reaction to looking at node sharing for the first time, and maybe I just need to grow up a bit and deal with training users to be more specific and accurate. ;)

Thanks!

--
Brian D. Haymore
University of Utah Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, UT 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C
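P.S. To make the two submission styles concrete, here is roughly the contrast I have in mind for a made-up 16-core job. The first line is what we can do today; the --mem=ALL syntax in the second is purely hypothetical as far as I know:

    # Works today, but forces the lowest-common-denominator memory request (MB)
    sbatch -n 16 --mem-per-cpu=2048 job.sh

    # Hypothetical: whatever node(s) the job lands on, give it all available RAM there
    sbatch -n 16 --mem=ALL job.sh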