Alternative solution we're trying... Create the file
/etc/systemd/system/gpfs.service.d/delay.conf containing:

[Service]
ExecStartPre=/bin/sleep 60

Then I expect we should have a long enough delay for InfiniBand to start
before starting GPFS.


  -jf

On Fri, Mar 16, 2018 at 1:05 PM, Frederick Stock <sto...@us.ibm.com> wrote:

> I have my doubts that mmdiag can be used in this script. In general the
> guidance is to avoid or be very careful with mm* commands in a callback due
> to the potential for deadlock.
>
> Fred
> __________________________________________________
> Fred Stock | IBM Pittsburgh Lab | 720-430-8821
> sto...@us.ibm.com
>
>
> From: Jan-Frode Myklebust <janfr...@tanso.net>
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date: 03/16/2018 04:30 AM
> Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> ------------------------------
>
> Thanks Olaf, but we don't use NetworkManager on this cluster..
>
> I now created this simple script:
>
> ------------------------------------------------------------
> #! /bin/bash -
> #
> # Fail mmstartup if not all configured IB ports are active.
> #
> # Install with:
> #
> # mmaddcallback fail-if-ibfail --command /var/mmfs/etc/fail-if-ibfail --event preStartup --sync --onerror shutdown
> #
>
> for port in $(/usr/lpp/mmfs/bin/mmdiag --config|grep verbsPorts | cut -f 4- -d " ")
> do
>     grep -q ACTIVE /sys/class/infiniband/${port%/*}/ports/${port##*/}/state || exit 1
> done
> ------------------------------------------------------------
>
> which I haven't tested, but assume should work. Suggestions for
> improvements would be much appreciated!
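
Given the mmdiag concern quoted above, here is a minimal sketch of the same
port check that takes the port list as arguments instead of calling mmdiag
from inside the callback. The function name ib_ports_active and the
IB_SYSFS_ROOT override are made up for illustration, and the device/port
format (device name, slash, port number) is assumed to match what verbsPorts
holds. Not tested on a real cluster.

```shell
#!/bin/bash
# Sketch: succeed only if every listed InfiniBand port reports ACTIVE.
# Ports are passed as device/port pairs rather than read from mmdiag.
# IB_SYSFS_ROOT is a hypothetical override so the check can be tried
# against a scratch directory; it defaults to the real sysfs path.

ib_ports_active() {
    local root=${IB_SYSFS_ROOT:-/sys/class/infiniband}
    local port state_file
    for port in "$@"; do
        # device/port -> <root>/<device>/ports/<port>/state
        state_file="$root/${port%/*}/ports/${port##*/}/state"
        # A missing or non-ACTIVE state file counts as failure.
        grep -q ACTIVE "$state_file" 2>/dev/null || return 1
    done
    return 0
}
```

In a callback script this could be used as "ib_ports_active <port list> ||
exit 1", with the port list written out by hand or by a configuration tool
rather than looked up with an mm* command at callback time.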
>
> -jf
>
>
> On Thu, Mar 15, 2018 at 6:30 PM, Olaf Weiser <olaf.wei...@de.ibm.com> wrote:
>
> you can try :
> systemctl enable NetworkManager-wait-online
> ln -s '/usr/lib/systemd/system/NetworkManager-wait-online.service' '/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online.service'
>
> in many cases .. it helps ..
>
>
> From: Jan-Frode Myklebust <janfr...@tanso.net>
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date: 03/15/2018 06:18 PM
> Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> ------------------------------
>
> I found some discussion on this at
> https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=77777777-0000-0000-0000-000014471957&ps=25
> and there it's claimed that none of the callback events are early enough to
> resolve this. That we need a pre-preStartup trigger. Any idea if this has
> changed -- or is the callback option then only to do a "--onerror
> shutdown" if it has failed to connect IB ?
>
>
> On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock <sto...@us.ibm.com> wrote:
> You could also use the GPFS prestartup callback (mmaddcallback) to execute
> a script synchronously that waits for the IB ports to become available
> before returning and allowing GPFS to continue. Not systemd integrated but
> it should work.
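
The "wait synchronously" idea quoted above could look roughly like the sketch
below: poll a port state file until it reports ACTIVE or a timeout runs out.
The function name wait_for_ib_active and the default timeout and interval are
assumptions, and whether a preStartup callback runs early enough for this to
help is the open question raised earlier in the thread.

```shell
#!/bin/bash
# Sketch: block until a port state file reports ACTIVE, or give up
# after a timeout. Returns 0 when ACTIVE was seen, 1 on timeout.

wait_for_ib_active() {
    local state_file=$1
    local timeout=${2:-300}   # total seconds to wait (assumed default)
    local interval=${3:-5}    # seconds between polls (assumed default)
    local waited=0
    until grep -q ACTIVE "$state_file" 2>/dev/null; do
        if (( waited >= timeout )); then
            return 1          # still not ACTIVE; the caller decides what to do
        fi
        sleep "$interval"
        waited=$(( waited + interval ))
    done
    return 0
}
```

Returning a status instead of exiting leaves the policy to the caller: a
callback registered with --onerror shutdown could exit 1 on timeout, while a
boot script might only log the failure and carry on.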
>
> Fred
> __________________________________________________
> Fred Stock | IBM Pittsburgh Lab | 720-430-8821
> sto...@us.ibm.com
>
>
> From: david_john...@brown.edu
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Date: 03/08/2018 07:34 AM
> Subject: Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> ------------------------------
>
> Until IBM provides a solution, here is my workaround. Add it so it runs
> before the gpfs script, I call it from our custom xcat diskless boot
> scripts. Based on rhel7, not fully systemd integrated. YMMV!
>
> Regards,
> — ddj
> ——-
> [ddj@storage041 ~]$ cat /etc/init.d/ibready
> #! /bin/bash
> #
> # chkconfig: 2345 06 94
> # /etc/rc.d/init.d/ibready
> # written in 2016 David D Johnson (ddj <at> brown.edu)
> #
> ### BEGIN INIT INFO
> # Provides: ibready
> # Required-Start:
> # Required-Stop:
> # Default-Stop:
> # Description: Block until infiniband is ready
> # Short-Description: Block until infiniband is ready
> ### END INIT INFO
>
> RETVAL=0
> if [[ -d /sys/class/infiniband ]]
> then
>     IBDEVICE=$(dirname $(grep -il infiniband /sys/class/infiniband/*/ports/1/link* | head -n 1))
> fi
> # See how we were called.
> case "$1" in
>   start)
>     if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
>     then
>         echo -n "Polling for InfiniBand link up: "
>         for (( count = 60; count > 0; count-- ))
>         do
>             if grep -q ACTIVE $IBDEVICE/state
>             then
>                 echo ACTIVE
>                 break
>             fi
>             echo -n "."
>             sleep 5
>         done
>         if (( count <= 0 ))
>         then
>             echo DOWN - $0 timed out
>         fi
>     fi
>     ;;
>   stop|restart|reload|force-reload|condrestart|try-restart)
>     ;;
>   status)
>     if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
>     then
>         echo "$IBDEVICE is $(< $IBDEVICE/state) $(< $IBDEVICE/rate)"
>     else
>         echo "No IBDEVICE found"
>     fi
>     ;;
>   *)
>     echo "Usage: ibready {start|stop|status|restart|reload|force-reload|condrestart|try-restart}"
>     exit 2
> esac
> exit ${RETVAL}
> ————
>
> -- ddj
> Dave Johnson
>
> On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI) <marc.cau...@psi.ch> wrote:
>
> Hi all,
>
> with autoload = yes we do not ensure that GPFS will be started after the
> IB link becomes up. Is there a way to force GPFS waiting to start until IB
> ports are up? This can be probably done by adding something like
> After=network-online.target and Wants=network-online.target in the systemd
> file but I would like to know if this is natively possible from the GPFS
> configuration.
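
The After=/Wants= idea from the original question can be tried without
editing the shipped unit file by writing a drop-in, as in the sketch below.
The function name write_gpfs_dropin and the file name wait-online.conf are
hypothetical; the gpfs.service.d directory is the one used for delay.conf at
the top of this mail. Note that reaching network-online.target may not imply
that the IB ports are ACTIVE, so this is a starting point at best.

```shell
#!/bin/bash
# Sketch: write a systemd drop-in that orders gpfs.service after
# network-online.target. The drop-in file name is hypothetical.

write_gpfs_dropin() {
    # Target directory can be overridden, e.g. to inspect the result
    # in a scratch location before installing it for real.
    local dir=${1:-/etc/systemd/system/gpfs.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/wait-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}
```

After writing the file for real, run "systemctl daemon-reload" so systemd
picks up the drop-in.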
>
> Thanks a lot,
> Marc
> _________________________________________
> Paul Scherrer Institut
> High Performance Computing
> Marc Caubet Serrabou
> WHGA/036
> 5232 Villigen PSI
> Switzerland
>
> Telephone: +41 56 310 46 67
> E-Mail: marc.cau...@psi.ch
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss