On Thu, 10 Mar 2016 00:19:16 -0800, Rémi Palancher <[email protected]> wrote:
> That's exactly the purpose of the patch since we were facing the same
> issue with IB and GPFS.

I also mucked around with adding mount dependencies to the slurmd systemd service in CentOS 7, but that did not work out well for BeeGFS, since systemd doesn't really know how that mount appears. Also, BeeGFS can mount successfully before InfiniBand is active, as it happily uses an existing Ethernet connection as an alternative.

I ended up explicitly waiting for the links on eth0 and ib0 to come up in an xCAT postscript, with the slurmd service disabled by default. Once everything is there, the boot script starts slurmd explicitly via systemctl. At that point, the node really can accept jobs.

If you want to rely on systemd's automagic to start everything at the same time and to emulate the presence of services that are still starting up, including slurmd, this sanity check inside slurmd might be the only way to ensure something resembling a sane "booted" state for a node, at least until those checks and dependencies really work inside systemd (including GPFS and BeeGFS mounts). But it is not unreasonable to keep those configuration and health checks separate, UNIX-style.


Alrighty then,

Thomas

-- 
Dr. Thomas Orgis
Universität Hamburg
RRZ / Zentrale Dienste / HPC
Schlüterstr. 70
20146 Hamburg
Tel.: 040/42838 8826
Fax: 040/428 38 6270
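
For illustration only, the wait-then-start approach described in this message could look roughly like the sketch below. It is not the actual xCAT postscript, which is not shown here. The function names, the timeout and polling interval, and the use of /sys/class/net/<iface>/operstate as the "link is up" test are all assumptions; only the interface names eth0 and ib0 and the explicit systemctl start of slurmd come from the message.

```python
import subprocess
import time
from pathlib import Path


def link_is_up(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports the interface as operationally up."""
    # Assumption: operstate reads "up" once the link is usable.
    state_file = Path(sysfs_root) / iface / "operstate"
    try:
        return state_file.read_text().strip() == "up"
    except OSError:
        # Interface not present (yet), e.g. the driver is still loading.
        return False


def wait_for_links(ifaces, timeout=300.0, interval=2.0,
                   sysfs_root="/sys/class/net"):
    """Poll until every interface is up; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if all(link_is_up(iface, sysfs_root) for iface in ifaces):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def main():
    """Hypothetical entry point for a boot-time script."""
    if not wait_for_links(["eth0", "ib0"]):
        raise SystemExit("links did not come up in time; leaving slurmd stopped")
    # slurmd is disabled by default, so start it only once the node is ready.
    subprocess.run(["systemctl", "start", "slurmd"], check=True)
```

A boot script would call main() after the network setup has been kicked off. A check that the parallel file system is actually mounted could be added in front of the systemctl call in the same style.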
