Hi Malahal:

Just to clarify, are you saying that on your VM pidof is missing? Or that it is there and not working as it did prior to RHEL/CentOS 8? pidof is returning pid numbers on my system. I've been looking at the mmnfsmonitor script and trying to see where the check for nfsd might be failing, but I've not been able to figure it out yet.
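One way to cross-check for nfsd threads without relying on pidof is to read each task's name from /proc/<pid>/comm, which Linux populates for kernel threads as well as user processes. The Python sketch below does only that. It is not what mmnfsmonitor does, and the function name find_pids_by_comm and its proc_root parameter are made up for illustration.

```python
import os


def find_pids_by_comm(name, proc_root="/proc"):
    """Return the sorted PIDs whose /proc/<pid>/comm equals `name`.

    Hypothetical helper for illustration; proc_root is a parameter
    only so the function can be pointed at a test directory.
    """
    pids = []
    for entry in os.listdir(proc_root):
        # Only numeric directory names correspond to tasks.
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            # The task exited between listdir() and open(), or is unreadable.
            continue
        if comm == name:
            pids.append(int(entry))
    return sorted(pids)
```

Calling find_pids_by_comm("nfsd") on the CNFS node and comparing the result with the output of "pidof nfsd" should show whether the two disagree about the nfsd kernel threads.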
Thanks,
Bryan

---
Bryan Hill
Lead System Administrator
UCSD Physics Computing Facility

9500 Gilman Dr. # 0319
La Jolla, CA 92093
+1-858-534-5538
[email protected]

On Sat, Feb 15, 2020 at 2:03 AM Malahal R Naineni <[email protected]> wrote:

> I am not familiar with CNFS but looking at git source seems to indicate
> that it uses 'pidof' to check if a program is running or not. "pidof nfsd"
> works on RHEL7.x but it fails on my centos8.1 I just created. So either we
> need to make sure pidof works on kernel threads or fix CNFS scripts.
>
> Regards, Malahal.
>
>
> ----- Original message -----
> From: Bryan Hill <[email protected]>
> Sent by: [email protected]
> To: [email protected]
> Cc:
> Subject: [EXTERNAL] [gpfsug-discuss] CNFS issue after upgrading from
> 4.2.3.11 to 5.0.4.2
> Date: Fri, Feb 14, 2020 11:40 PM
>
> Hi All:
>
> I'm performing a rolling upgrade of one of our GPFS clusters. This
> particular cluster has 2 CNFS servers for some of our NFS clients. I wiped
> one of the nodes and installed RHEL 8.1 and GPFS 5.0.4.2. The filesystem
> mounts fine on the node when I disable CNFS on the node, but with it
> enabled it's a no go. It appears mmnfsmonitor doesn't recognize that nfsd
> has started, so it assumes the worst and shuts down the file system (I
> currently have reboot on failure disabled to debug this). The thing is, it
> actually does start nfsd processes when running mmstartup on the node.
> Doing a "ps" shows 32 nfsd threads are running.
>
> Below is the CNFS-specific output from an attempt to start the node:
>
> CNFS[27243]: Restarting lockd to start grace
> CNFS[27588]: Enabling 172.16.69.76
> CNFS[27694]: Restarting lockd to start grace
> CNFS[27699]: Starting NFS services
> CNFS[27764]: NFS clients of node 172.16.69.122 notified to reclaim NLM
> locks
> CNFS[27910]: Monitor has started pid=27787
> CNFS[28702]: Monitor detected nfsd was not running, will attempt to start
> it
> CNFS[28705]: Starting NFS services
> CNFS[28730]: NFS clients of node 172.16.69.122 notified to reclaim NLM
> locks
> CNFS[28755]: Monitor detected nfsd was not running, will attempt to start
> it
> CNFS[28758]: Starting NFS services
> CNFS[28789]: NFS clients of node 172.16.69.122 notified to reclaim NLM
> locks
> CNFS[28813]: Monitor detected nfsd was not running, will attempt to start
> it
> CNFS[28816]: Starting NFS services
> CNFS[28844]: NFS clients of node 172.16.69.122 notified to reclaim NLM
> locks
> CNFS[28867]: Monitor detected nfsd was not running, will attempt to start
> it
> CNFS[28874]: Monitoring detected NFSD is inactive. mmnfsmonitor: NFS
> server is not running or responding. Node failure initiated as configured.
> CNFS[28924]: Unexporting all GPFS filesystems
>
> Any thoughts? My other CNFS node is handling everything for the time
> being, thankfully!
>
> Thanks,
> Bryan
>
> ---
> Bryan Hill
> Lead System Administrator
> UCSD Physics Computing Facility
>
> 9500 Gilman Dr. # 0319
> La Jolla, CA 92093
> +1-858-534-5538
> [email protected]
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
