Hi:

Just a note on this: the pidof fix was accepted upstream but has not yet made its way into RHEL 8.2.

Thanks,
Bryan

---
Bryan Hill
Lead System Administrator
UCSD Physics Computing Facility

9500 Gilman Dr. # 0319
La Jolla, CA 92093
+1-858-534-5538
[email protected]

On Mon, Feb 17, 2020 at 12:02 AM Malahal R Naineni <[email protected]> wrote:
>
> I filed a defect here, let us see what Redhat says. Yes, it doesn't work for
> any kernel threads. It doesn't work for user level threads/processes.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1803640
>
> Regards, Malahal.
>
> ----- Original message -----
> From: Bryan Hill <[email protected]>
> Sent by: [email protected]
> To: gpfsug main discussion list <[email protected]>
> Cc:
> Subject: [EXTERNAL] Re: [gpfsug-discuss] CNFS issue after upgrading from
> 4.2.3.11 to 5.0.4.2
> Date: Mon, Feb 17, 2020 8:26 AM
>
> Ah wait, I see what you might mean. pidof works but not specifically for
> processes like nfsd. That is odd.
>
> Thanks,
> Bryan
>
> On Sun, Feb 16, 2020 at 10:19 AM Bryan Hill <[email protected]> wrote:
>
> Hi Malahal:
>
> Just to clarify, are you saying that on your VM pidof is missing? Or that
> it is there and not working as it did prior to RHEL/CentOS 8? pidof is
> returning pid numbers on my system. I've been looking at the mmnfsmonitor
> script and trying to see where the check for nfsd might be failing, but I've
> not been able to figure it out yet.
>
> Thanks,
> Bryan
>
> ---
> Bryan Hill
> Lead System Administrator
> UCSD Physics Computing Facility
>
> 9500 Gilman Dr. # 0319
> La Jolla, CA 92093
> +1-858-534-5538
> [email protected]
>
> On Sat, Feb 15, 2020 at 2:03 AM Malahal R Naineni <[email protected]> wrote:
>
> I am not familiar with CNFS but looking at git source seems to indicate that
> it uses 'pidof' to check if a program is running or not. "pidof nfsd" works
> on RHEL7.x but it fails on my centos8.1 I just created. So either we need to
> make sure pidof works on kernel threads or fix CNFS scripts.
>
> Regards, Malahal.
>
> ----- Original message -----
> From: Bryan Hill <[email protected]>
> Sent by: [email protected]
> To: [email protected]
> Cc:
> Subject: [EXTERNAL] [gpfsug-discuss] CNFS issue after upgrading from 4.2.3.11
> to 5.0.4.2
> Date: Fri, Feb 14, 2020 11:40 PM
>
> Hi All:
>
> I'm performing a rolling upgrade of one of our GPFS clusters. This
> particular cluster has 2 CNFS servers for some of our NFS clients. I wiped
> one of the nodes and installed RHEL 8.1 and GPFS 5.0.4.2. The filesystem
> mounts fine on the node when I disable CNFS on the node, but with it enabled
> it's a no go. It appears mmnfsmonitor doesn't recognize that nfsd has
> started, so it assumes the worst and shuts down the file system (I currently
> have reboot on failure disabled to debug this). The thing is, it actually
> does start nfsd processes when running mmstartup on the node. Doing a "ps"
> shows 32 nfsd threads are running.
>
> Below is the CNFS-specific output from an attempt to start the node:
>
> CNFS[27243]: Restarting lockd to start grace
> CNFS[27588]: Enabling 172.16.69.76
> CNFS[27694]: Restarting lockd to start grace
> CNFS[27699]: Starting NFS services
> CNFS[27764]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[27910]: Monitor has started pid=27787
> CNFS[28702]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28705]: Starting NFS services
> CNFS[28730]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28755]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28758]: Starting NFS services
> CNFS[28789]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28813]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28816]: Starting NFS services
> CNFS[28844]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28867]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28874]: Monitoring detected NFSD is inactive. mmnfsmonitor: NFS server
> is not running or responding. Node failure initiated as configured.
> CNFS[28924]: Unexporting all GPFS filesystems
>
> Any thoughts? My other CNFS node is handling everything for the time being,
> thankfully!
>
> Thanks,
> Bryan
>
> ---
> Bryan Hill
> Lead System Administrator
> UCSD Physics Computing Facility
>
> 9500 Gilman Dr. # 0319
> La Jolla, CA 92093
> +1-858-534-5538
> [email protected]
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
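
For anyone hitting this before a fixed pidof is available, the idea from the thread (checking for nfsd in a way that also sees kernel threads) can be sketched as below. This is only a minimal sketch, not the check that mmnfsmonitor actually performs: the function name comm_is_running and its optional second argument are hypothetical, and it assumes the nfsd kernel threads report the name "nfsd" in /proc/<pid>/comm.

```shell
#!/bin/sh
# Hypothetical helper (not from the CNFS scripts): succeed if any process
# or kernel thread has exactly the given command name.
# $1 = command name to look for, e.g. nfsd
# $2 = proc root to scan (defaults to /proc; overridable only for testing)
comm_is_running() {
    name=$1
    procroot=${2:-/proc}
    for f in "$procroot"/[0-9]*/comm; do
        # Skip unmatched globs and entries that vanished since the glob ran.
        [ -r "$f" ] || continue
        comm=""
        # The process may exit between the test above and this read.
        { read -r comm < "$f"; } 2>/dev/null || true
        if [ "$comm" = "$name" ]; then
            return 0
        fi
    done
    return 1
}
```

A caller would use it along the lines of `if comm_is_running nfsd; then echo "nfsd is up"; fi`. Reading /proc directly avoids depending on how a particular pidof build treats kernel threads, at the cost of being Linux-specific.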
