See:
http://slurm.schedmd.com/quickstart_admin.html#upgrade
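That page describes a compatibility window: Slurm daemons accept RPCs from the same major release and the two previous major releases, and the controller should be upgraded before the compute nodes. The sketch below illustrates that rule only. It is not part of Slurm; the release list and the names MAJOR_RELEASES, major and can_talk are assumptions made up for the example.

```python
# Ordered list of Slurm major releases (assumed for illustration; extend as needed).
MAJOR_RELEASES = ["2.4", "2.5", "2.6", "14.03", "14.11"]


def major(version: str) -> str:
    """Return the major release ("14.11") of a full version string ("14.11.0")."""
    return ".".join(version.split(".")[:2])


def can_talk(newer: str, older: str, window: int = 2) -> bool:
    """True if `older` is at most `window` major releases behind `newer`."""
    gap = MAJOR_RELEASES.index(major(newer)) - MAJOR_RELEASES.index(major(older))
    return 0 <= gap <= window


print(can_talk("14.11.0", "2.5.7"))  # False: 2.5 is three major releases behind 14.11
print(can_talk("14.11.0", "2.6.9"))  # True: 2.6 is within the two-release window
```

By this rule 14.11 and 2.5.7 are too far apart to be mixed in one cluster, which would be consistent with the protocol mismatch messages in the log.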
Quoting Всеволод Никоноров <[email protected]>:
Hello,
I tried to test slurm-14.11 on some of my nodes while other nodes
ran slurm-2.5.7, and the nodes running 14.11 were not excluded from
the 2.5.7 controller's config. Something seems to have confused the
2.5.7 controller: for some time tasks were doubled (each task was
visible twice in the smap list), and after I excluded the 14.11 nodes
from the 2.5.7 controller's config, those tasks restarted and the
doubling stopped.
Can the protocol mismatch (which was definitely visible in the log) be
related to the task doubling and hanging? Are there any other safety
measures besides excluding foreign-version nodes from each
controller's config? I don't want to make our polite users sad again :)
Thanks in advance!
--
Morris "Moe" Jette
CTO, SchedMD LLC
Slurm User Group Meeting
September 23-24, Lugano, Switzerland
Find out more http://slurm.schedmd.com/slurm_ug_agenda.html