An explanation of the problem, along with the fix, will be in version 2.6.2. The commit containing the fix is here:
https://github.com/SchedMD/slurm/commit/98e24b0dedc7af8b00c06ceaf99e67dbc6a1b5f2

Moe Jette
SchedMD LLC
Slurm commercial support and development

Quoting Magnus Jonsson <[email protected]>:

Hi!

A user reported a strange behaviour of squeue in our newly installed 2.6.1 version.

The following submit file:
-----8<-----
#!/bin/bash -l

#SBATCH -N 1
#SBATCH -n 12
#SBATCH --time=5-00:00:00

hostname
-----8<-----

Results in the following output from squeue if I use -j or -u:

-----8<-----
% squeue -u magnus
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
1190903     batch  submit2   magnus PD       0:00     12 (Priority)
-----8<-----

-----8<-----
% squeue -j 1190903
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
1190903     batch  submit2   magnus PD       0:00     12 (Priority)
-----8<-----

But if I use grep:

-----8<-----
% squeue | grep 1190903
1190903 batch submit2 magnus PD 0:00 1 (Priority)
-----8<-----

I get the expected behaviour.

I have tracked this down to a commit described as "To minimize overhead":

https://github.com/SchedMD/slurm/commit/ac44db862c8d1f460e55ad09017d058942ff6499

The relevant change is on line 397/416 in src/squeue/opts.c. max_cpus is used in _get_node_cnt() to estimate the number of nodes required.

After reverting the params.max_cpus code, I get the expected behaviour.
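
To illustrate how a node-count estimate that depends on max_cpus could turn 1 node into 12, here is a minimal sketch. It is not the Slurm source: estimate_nodes is a hypothetical helper, and it assumes the estimate is a ceiling division of the requested CPUs by max_cpus, that the larger of that value and the requested node count is reported, that the nodes have 12 cores, and that max_cpus falls back to 1 when it has not been loaded.

```shell
#!/bin/bash
# Hypothetical helper, not Slurm code: estimate the node count of a pending job.
#   $1 = nodes requested (-N)
#   $2 = CPUs/tasks requested (-n)
#   $3 = assumed largest CPU count of any node (stand-in for max_cpus)
estimate_nodes() {
    local req_nodes=$1 num_cpus=$2 max_cpus=$3
    # Guard against a zero or negative value to avoid division by zero.
    if [ "$max_cpus" -lt 1 ]; then
        max_cpus=1
    fi
    # Ceiling division: how many nodes are needed to hold num_cpus.
    local by_cpus=$(( (num_cpus + max_cpus - 1) / max_cpus ))
    # Report the larger of the explicit request and the CPU-based estimate.
    if [ "$by_cpus" -gt "$req_nodes" ]; then
        echo "$by_cpus"
    else
        echo "$req_nodes"
    fi
}

estimate_nodes 1 12 12   # max_cpus known (12 cores per node assumed)
estimate_nodes 1 12 1    # max_cpus left at an assumed fallback of 1
```

Under these assumptions the first call prints 1 and the second prints 12, which matches the difference seen between plain squeue and squeue with -j or -u.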

Best regards,
Magnus

--
Magnus Jonsson, Developer, HPC2N, Umeå Universitet



