Thanks Olaf, but we don't use NetworkManager on this cluster. I've now created this simple script:
------------------------------------------------------------------------
#! /bin/bash -
#
# Fail mmstartup if not all configured IB ports are active.
#
# Install with:
#
# mmaddcallback fail-if-ibfail --command /var/mmfs/etc/fail-if-ibfail --event preStartup --sync --onerror shutdown
#

for port in $(/usr/lpp/mmfs/bin/mmdiag --config|grep verbsPorts | cut -f 4- -d " ")
do
        grep -q ACTIVE /sys/class/infiniband/${port%/*}/ports/${port##*/}/state || exit 1
done
------------------------------------------------------------------------

which I haven't tested, but assume should work. Suggestions for improvements would be much appreciated!

-jf

On Thu, Mar 15, 2018 at 6:30 PM, Olaf Weiser <olaf.wei...@de.ibm.com> wrote:

> you can try :
> systemctl enable NetworkManager-wait-online
> ln -s '/usr/lib/systemd/system/NetworkManager-wait-online.service' '/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online.service'
>
> in many cases .. it helps ..
>
> From: Jan-Frode Myklebust <janfr...@tanso.net>
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date: 03/15/2018 06:18 PM
> Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> ------------------------------
>
> I found some discussion on this at
> https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=77777777-0000-0000-0000-000014471957&ps=25
> and there it's claimed that none of the callback events are early enough to
> resolve this. That we need a pre-preStartup trigger. Any idea if this has
> changed -- or is the callback option then only to do a "--onerror
> shutdown" if it has failed to connect IB ?
>
> On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock <sto...@us.ibm.com> wrote:
> You could also use the GPFS prestartup callback (mmaddcallback) to execute
> a script synchronously that waits for the IB ports to become available
> before returning and allowing GPFS to continue. Not systemd integrated but
> it should work.
>
> Fred
> __________________________________________________
> Fred Stock | IBM Pittsburgh Lab | 720-430-8821
> sto...@us.ibm.com
>
> From: david_john...@brown.edu
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date: 03/08/2018 07:34 AM
> Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> ------------------------------
>
> Until IBM provides a solution, here is my workaround. Add it so it runs
> before the gpfs script, I call it from our custom xcat diskless boot
> scripts. Based on rhel7, not fully systemd integrated. YMMV!
>
> Regards,
> -- ddj
> -----
> [ddj@storage041 ~]$ cat /etc/init.d/ibready
> #! /bin/bash
> #
> # chkconfig: 2345 06 94
> # /etc/rc.d/init.d/ibready
> # written in 2016 David D Johnson (ddj <at> brown.edu)
> #
> ### BEGIN INIT INFO
> # Provides: ibready
> # Required-Start:
> # Required-Stop:
> # Default-Stop:
> # Description: Block until infiniband is ready
> # Short-Description: Block until infiniband is ready
> ### END INIT INFO
>
> RETVAL=0
> if [[ -d /sys/class/infiniband ]]
> then
>         IBDEVICE=$(dirname $(grep -il infiniband /sys/class/infiniband/*/ports/1/link* | head -n 1))
> fi
> # See how we were called.
> case "$1" in
>   start)
>         if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
>         then
>                 echo -n "Polling for InfiniBand link up: "
>                 for (( count = 60; count > 0; count-- ))
>                 do
>                         if grep -q ACTIVE $IBDEVICE/state
>                         then
>                                 echo ACTIVE
>                                 break
>                         fi
>                         echo -n "."
>                         sleep 5
>                 done
>                 if (( count <= 0 ))
>                 then
>                         echo DOWN - $0 timed out
>                 fi
>         fi
>         ;;
>   stop|restart|reload|force-reload|condrestart|try-restart)
>         ;;
>   status)
>         if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
>         then
>                 echo "$IBDEVICE is $(< $IBDEVICE/state) $(< $IBDEVICE/rate)"
>         else
>                 echo "No IBDEVICE found"
>         fi
>         ;;
>   *)
>         echo "Usage: ibready {start|stop|status|restart|reload|force-reload|condrestart|try-restart}"
>         exit 2
> esac
> exit ${RETVAL}
> -----
>
> -- ddj
> Dave Johnson
>
> On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI) <marc.cau...@psi.ch> wrote:
>
> Hi all,
>
> with autoload = yes we do not ensure that GPFS will be started after the
> IB link becomes up. Is there a way to force GPFS waiting to start until IB
> ports are up? This can be probably done by adding something like
> After=network-online.target and Wants=network-online.target in the systemd
> file but I would like to know if this is natively possible from the GPFS
> configuration.
>
> Thanks a lot,
> Marc
> _________________________________________
> Paul Scherrer Institut
> High Performance Computing
> Marc Caubet Serrabou
> WHGA/036
> 5232 Villigen PSI
> Switzerland
>
> Telephone: +41 56 310 46 67
> E-Mail: marc.cau...@psi.ch
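
Fred's suggestion quoted above, a synchronous preStartup callback that waits for the IB ports instead of failing on the first check, could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the function names (port_active, wait_for_ports) and the variables SYSFS_IB, IB_WAIT_TRIES and IB_WAIT_SLEEP are made up for illustration, the 60 x 5 second defaults are borrowed from the ibready script in the quoted thread, and verbsPorts entries are assumed to look like device/port, optionally followed by a fabric number. Whether the preStartup event fires early enough for this to help is exactly the open question raised in the thread.

```bash
#!/bin/bash
# Sketch: poll until every listed IB port reports ACTIVE, or give up.
# SYSFS_IB, IB_WAIT_TRIES and IB_WAIT_SLEEP are hypothetical knobs,
# not GPFS settings.
SYSFS_IB=${SYSFS_IB:-/sys/class/infiniband}
IB_WAIT_TRIES=${IB_WAIT_TRIES:-60}
IB_WAIT_SLEEP=${IB_WAIT_SLEEP:-5}

# port_active DEVICE/PORT[/FABRIC]
# Succeeds if the port's sysfs state file contains ACTIVE.
port_active() {
    local dev num _rest
    IFS=/ read -r dev num _rest <<< "$1"
    grep -q ACTIVE "$SYSFS_IB/$dev/ports/$num/state" 2>/dev/null
}

# wait_for_ports PORT...
# Returns 0 once all ports are ACTIVE, 1 after IB_WAIT_TRIES checks.
wait_for_ports() {
    local tries=$IB_WAIT_TRIES port all_up
    while (( tries > 0 )); do
        all_up=1
        for port in "$@"; do
            if ! port_active "$port"; then
                all_up=0
                break
            fi
        done
        if (( all_up )); then
            return 0
        fi
        tries=$(( tries - 1 ))
        if (( tries > 0 )); then
            sleep "$IB_WAIT_SLEEP"
        fi
    done
    return 1
}

# Possible use in the callback script, reusing the mmdiag pipeline
# from the script at the top of this message:
#   ports=$(/usr/lpp/mmfs/bin/mmdiag --config | grep verbsPorts | cut -f 4- -d " ")
#   wait_for_ports $ports || exit 1
```

Installed with the same mmaddcallback line as above (--sync --onerror shutdown), a non-zero exit after the timeout would still stop GPFS from starting, so the only change in behaviour is the grace period before giving up.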
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss