On 05/07/2015 05:24 PM, Moe Jette wrote:
Configure a DefMemPerCPU.
The default memory size for a job is all of the memory on the node, which
previously was not accounted for properly, so memory oversubscription was not
prevented.
Thank you very much for your quick answer!
That fixed the problem for all new
Thank you for your contribution to Slurm. A slight variation of your
patch will be in version 14.11.7 when released in late May. The commit
is here:
https://github.com/SchedMD/slurm/commit/bf81e826a2f7bc752d3239ea724e35ce2867a052
Quoting Jonathon Nelson jdnel...@dyn.com:
slurmstepd/task.c
slurmstepd/task.c does not properly save errno and may overwrite it before
using %m in error messages.
diff --git a/src/slurmd/slurmstepd/task.c b/src/slurmd/slurmstepd/task.c
index 186b6ad..a24d997 100644
--- a/src/slurmd/slurmstepd/task.c
+++ b/src/slurmd/slurmstepd/task.c
@@ -367,6 +367,7 @@
Trevor,
It depends a bit on the configuration of your cluster, but it sounds like what
you need to do is create a job submission file that requests enough resources
for one of your jobs and then submit them as an array. Read the man page for
sbatch to determine which switches you need
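As a sketch of that advice (the job name, resource values, and simulation binary below are all hypothetical; adjust them for your cluster), a submission file for an array of 100 jobs might look like:

```shell
#!/bin/bash
# Hypothetical job-array submission file -- resources are placeholders.
#SBATCH --job-name=fault-inject     # hypothetical name
#SBATCH --array=1-100               # one array task per simulation
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --time=01:00:00

# Each array task sees its own SLURM_ARRAY_TASK_ID.
srun ./my_simulation --run-id "${SLURM_ARRAY_TASK_ID}"
```

Submitted once with `sbatch submit.sh`, Slurm expands this into 100 tasks that are scheduled independently, which scales far better than submitting 100 separate jobs.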
Hello everyone,
I’m developing a piece of software that runs fault-injection simulations on a
cluster running Slurm and am trying to figure out the best method for
launching a potentially massive number of jobs. I’m not very familiar with
Slurm, and had a question about how Slurm allocates
This is how I recommend that users of our cluster pass variable parameters to
a Slurm array: in your batch file, set a variable that reads one line of a
data file for its parameters.
e.g.
PARAMETERS=$(awk -v line=${SLURM_ARRAY_TASK_ID} '{if (NR == line) { print $0; };}' ./data.dat)
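The same line extraction can be tried outside Slurm by faking the array task ID (the data file contents here are made up, and the awk program is shortened to the equivalent `NR == line` form):

```shell
# Simulate what one array task would see.
printf 'alpha 1\nbeta 2\ngamma 3\n' > data.dat
SLURM_ARRAY_TASK_ID=2

# Pull the line of data.dat matching the task ID.
PARAMETERS=$(awk -v line=${SLURM_ARRAY_TASK_ID} 'NR == line' ./data.dat)
echo "$PARAMETERS"   # prints the second line: beta 2

rm data.dat
```

Task 1 gets line 1, task 2 gets line 2, and so on, so one data file drives the whole array.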
What this does
http://slurm.schedmd.com/pdfs/LCS_cgroups_BULL.pdf
was an interesting read. I'd assume the cgroup/devices subsystem is now fully
functional and that GPUs and Xeon Phi are supported.
But what about I/O and network usage limits, monitoring, and reporting? NFS or
Lustre file systems? Energy?
Dr Igor Kozin |