Exactly. The easiest way is just to underreport the amount of memory in
slurm. That way slurm will take care of it natively. We do this here as
well even though we have disks in order to make sure the OS has memory
left to run.
-Paul Edmon-
On 3/14/19 8:36 AM, Doug Meyer wrote:
We also run diskless. In slurm.conf we round down the memory so slurm
does not have the total budget to work with, and we set a default memory
per job reflecting declared memory / number of threads per node. If users
don't declare a memory limit we are fine. If they declare more we are fine too.
Mostly
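The arithmetic behind this approach can be sketched roughly as follows. This is only an illustration in Python, not anyone's actual site configuration: the function name, the 8 GB reserve, the 1 GB rounding step, and the node size are all hypothetical. RealMemory and DefMemPerCPU are the slurm.conf parameters that such values would typically go into.

```python
def slurm_memory_settings(total_mb, threads, reserve_mb=4096, round_to_mb=1024):
    """Return (real_memory_mb, def_mem_per_cpu_mb) for one node.

    total_mb    -- physical memory on the node, in MB
    threads     -- number of hardware threads Slurm schedules on the node
    reserve_mb  -- memory held back for the OS / RAM-based root filesystem
                   (hypothetical value, tune per site)
    round_to_mb -- round the declared memory down to a multiple of this
    """
    if threads <= 0:
        raise ValueError("threads must be positive")
    if total_mb <= reserve_mb:
        raise ValueError("reserve is larger than the node's memory")

    usable_mb = total_mb - reserve_mb
    # Round down so Slurm never sees more memory than is really usable.
    real_memory_mb = (usable_mb // round_to_mb) * round_to_mb
    # Default per-CPU memory: declared memory divided by threads per node.
    def_mem_per_cpu_mb = real_memory_mb // threads
    return real_memory_mb, def_mem_per_cpu_mb


if __name__ == "__main__":
    # Hypothetical node: 193000 MB of RAM, 40 threads, 8 GB held back.
    real_mem, def_mem = slurm_memory_settings(193000, 40, reserve_mb=8192)
    print(f"RealMemory={real_mem}")
    print(f"DefMemPerCPU={def_mem}")
```

With a default derived this way, a job that declares no memory gets its per-thread share, so a full node of default-sized jobs cannot exceed the declared memory.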
Hello Paul,
Thank you for your advice. That all makes sense. We're running diskless
compute nodes and so the usable memory is less than the total memory. So I
have added a memory check to my job_submit.lua -- see below. I think that
all makes sense.
Best regards,
David
-- Check memory/node is va
Slurm should automatically block or reject jobs that can't run on that
partition in terms of memory usage for a single node. So you shouldn't
need to do anything. If you need something less than the max memory per
node then you will need to enforce some limits. We do this via a
jobsubmit lua
Hello,
I have set up a serial queue to run small jobs in the cluster. Actually, I
route jobs to this queue using the job_submit.lua script. Any 1 node job using
up to 20 cpus is routed to this queue, unless a user submits their job with an
exclusive flag.
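The routing rule described above can be sketched as a small predicate. This is written in Python purely for illustration and is not the actual job_submit.lua; the JobRequest type, the choose_partition function, and the partition name "serial" are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SERIAL_MAX_CPUS = 20  # "up to 20 cpus", as stated in the message


@dataclass
class JobRequest:
    """Hypothetical stand-in for the fields a submit filter would inspect."""
    nodes: int
    cpus: int
    exclusive: bool = False


def choose_partition(job: JobRequest,
                     requested: Optional[str] = None,
                     serial: str = "serial") -> Optional[str]:
    """Route small single-node, non-exclusive jobs to the serial partition.

    Any other job keeps whatever partition the user requested.
    """
    if job.nodes == 1 and job.cpus <= SERIAL_MAX_CPUS and not job.exclusive:
        return serial
    return requested


if __name__ == "__main__":
    print(choose_partition(JobRequest(nodes=1, cpus=4)))
    print(choose_partition(JobRequest(nodes=1, cpus=4, exclusive=True), "batch"))
    print(choose_partition(JobRequest(nodes=2, cpus=80), "batch"))
```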
The partition is shared and so I def