(Since sgather is in contrib, and I found no contact address in it, I
post the report here.)

sgather in Slurm 14.04.1 has a bug that is triggered when nodes are set
up with different NodeName and NodeHostname (and hostname(1) returns the
NodeHostname).  Changing

    nodelist=$($SRUN --ntasks=$SLURM_NNODES --ntasks-per-node=1 -l hostname) || exit $?
    nodelist=$(echo "$nodelist" | cut -d ' ' -f 2 | sort)

into

    nodelist=$($SCONTROL show hostnames $SLURM_NODELIST | sort)

should fix it (I am not sure if sort is even needed).  It should also be
slightly more efficient.

It would also be nice if the node-global destinations could be
configurable, instead of being hard-coded in the script (or at least be
set at the top of the script).  For instance, on our system, the
node-global file systems are /work and /cluster, not /scratch and /home.
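
As a rough sketch of what I mean (not taken from sgather itself), the
node-global prefixes could live in one variable at the top of the script,
with the current values as defaults.  The variable name SGATHER_GLOBAL_DIRS
and the helper is_node_global are made up for illustration; how the script
would actually use the check is left open.

```shell
# Hypothetical setting (not an existing sgather option): colon-separated
# list of node-global directory prefixes.  It can be overridden from the
# environment; the defaults are the two paths currently hard-coded.
: "${SGATHER_GLOBAL_DIRS:=/scratch:/home}"

# Return 0 if path $1 is one of the node-global directories or lies
# below one of them, 1 otherwise.  The function body runs in a subshell
# so that the IFS change does not leak into the rest of the script.
is_node_global() (
    path=$1
    IFS=:
    for dir in $SGATHER_GLOBAL_DIRS; do
        case "$path" in
            "$dir"|"$dir"/*) return 0 ;;
        esac
    done
    return 1
)
```

On our system one would then run with
SGATHER_GLOBAL_DIRS=/work:/cluster in the environment, or simply edit the
default on that one line.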

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
