Thank you, Lufthanken. I'll try this and report my results in this mail thread.
Best,
Jianwen

> On 7 Mar 2017, at 16:50, TO_Webmaster <[email protected]> wrote:
>
> Maybe it helps if you modify the [Unit] part of slurmd.service as follows:
>
> After=network.target network-online.target munge.service
> Requires=network.target network-online.target munge.service
>
> If this is not sufficient, you might further try:
>
> After=network.target remote-fs.target munge.service
> Requires=network.target remote-fs.target munge.service
>
> 2017-03-07 2:26 GMT+01:00 Jianwen Wei <[email protected]>:
>> Dear SLURM developers,
>>
>> We encountered an issue similar to the one reported by Tingyang Xu in
>> "slurm cannot work with Infiniband after rebooting". More details can be
>> found in his posts on the SLURM and Intel forums.
>>
>> https://groups.google.com/forum/#!searchin/slurm-devel/slurm$20cannot$20work$20with$20Infiniband$20after$20rebooting%7Csort:relevance/slurm-devel/GUeOOlaayLk/OsvdTAsRtdsJ
>> https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/534491
>>
>> A workaround is to restart the slurmd service:
>>
>> # systemctl restart slurmd
>>
>> As indicated in the Intel forum, this issue may be caused by InfiniBand
>> being unavailable when SLURM starts. Do you have any recommendation for
>> making SLURM the last service to start when rebooting the host?
>>
>>
>> Best,
>>
>> Jianwen
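
For reference, the [Unit] change suggested in this thread could also be applied as a systemd drop-in instead of editing the packaged slurmd.service directly, so that a package update does not overwrite it. This is only a sketch: the drop-in file name override.conf is an assumption, and whether these dependencies are enough for InfiniBand to be ready on a given cluster has not been verified here.

```ini
# /etc/systemd/system/slurmd.service.d/override.conf
# (file name is an assumption; any *.conf in this directory is read)
[Unit]
# Start slurmd only after the network is reported online and munge is up,
# as suggested earlier in this thread.
After=network.target network-online.target munge.service
Requires=network.target network-online.target munge.service
```

After creating the file, run "systemctl daemon-reload" so that systemd picks up the change, then reboot to check whether slurmd comes up correctly without a manual restart.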
