It may help to modify the [Unit] section of slurmd.service as follows:

After=network.target network-online.target munge.service
Requires=network.target network-online.target munge.service

If that is not sufficient, you could try ordering after remote-fs.target instead:

After=network.target remote-fs.target munge.service
Requires=network.target remote-fs.target munge.service
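
One way to apply either variant without editing the packaged unit file is a systemd drop-in. The following is only a sketch: the file name override.conf is my own choice, and I am assuming the unit is installed as slurmd.service. After= and Requires= entries in a drop-in are added to those already in the unit. Note also that network-online.target only delays startup if a wait-online service (for example NetworkManager-wait-online.service) is enabled on the node.

```ini
# /etc/systemd/system/slurmd.service.d/override.conf
# (file name is an arbitrary choice; any *.conf in this directory is read)
[Unit]
After=network.target network-online.target munge.service
Requires=network.target network-online.target munge.service

# Then, as root, reload systemd and check the resulting dependencies:
#   systemctl daemon-reload
#   systemctl show slurmd.service -p After -p Requires
```

A reboot is then needed to see whether the new ordering actually helps.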

2017-03-07 2:26 GMT+01:00 Jianwen Wei <[email protected]>:
> Dear SLURM developers,
>
> We encountered issues similar to those reported by Tingyang Xu in "slurm
> cannot work with Infiniband after rebooting". More details can be found in
> his posts on the SLURM and Intel forums.
>
> https://groups.google.com/forum/#!searchin/slurm-devel/slurm$20cannot$20work$20with$20Infiniband$20after$20rebooting%7Csort:relevance/slurm-devel/GUeOOlaayLk/OsvdTAsRtdsJ
> https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/534491
>
> A workaround is to restart the slurmd service:
>
> # systemctl restart slurmd
>
> As indicated in the Intel forum, this issue may be caused by Infiniband
> being unavailable when SLURM starts. Do you have any recommendation for
> making SLURM the last service to start when the host reboots?
>
>
> Best,
>
> Jianwen
