Dear list,
In order to avoid issues when more than one job is hosted by a node (20
cores per node, PSM InfiniBand), I need to ensure that
PSM_RANKS_PER_CONTEXT is set to 2 whenever a job starts. I therefore
decided to export the variable from a TaskProlog.
Here is what I plan to put in the prolog:
# Only some of the nodes are based on PSM IB
if [[ ${HOSTNAME:0:8} == "<name>" ]]
then
    # Printing an "export" line seems to be the only way to have the
    # variable inherited by the user's job environment
    echo "export PSM_RANKS_PER_CONTEXT=2"
    exit 0
fi
Before adding these lines to the TaskProlog, I tested them this
way:
srun --task-prolog=slurm.prolog env
The fix seems to work well.
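For reference, here is a minimal sketch of the same idea written as a function, so that the hostname test can be checked outside of Slurm. The prefix "psmnode" and the function name emit_psm_env are hypothetical stand-ins (the real prefix is the one matched in the prolog above). It relies on the TaskProlog convention that a line of the form "export NAME=value" printed on standard output is added to the task's environment.

```shell
#!/bin/sh
# Hypothetical node-name prefix; replace with the real prefix of the PSM nodes.
PSM_PREFIX="psmnode"

# Print the export line only when the given hostname starts with the prefix.
# Slurm reads "export NAME=value" lines from a TaskProlog's stdout and
# sets them in the environment of the task being launched.
emit_psm_env() {
    case "$1" in
        "$PSM_PREFIX"*) echo "export PSM_RANKS_PER_CONTEXT=2" ;;
    esac
}

# Use $HOSTNAME when the shell provides it, otherwise ask the system.
emit_psm_env "${HOSTNAME:-$(uname -n)}"
```

Calling emit_psm_env with a sample hostname shows which nodes would receive the variable before the script is deployed as a TaskProlog.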
Is this the right way to do this? Is there a better or more
elegant solution?
--
DANY TELLO