Would it help to lower the grace time?

  mmnfs configuration change LEASE_LIFETIME=10
  mmnfs configuration change GRACE_PERIOD=10
-jf

On Wed, 26 Apr 2017 at 16:20, Simon Thompson (IT Research Support)
<[email protected]> wrote:

> Nope, the clients are all L3 connected, so not an arp issue.
>
> Two things we have observed:
>
> 1. It triggers when one of the CES IPs moves and quickly moves back again.
> The move occurs because the NFS server goes into grace:
>
> 2017-04-25 20:36:49 : epoch 00040183 : <NODENAME> : ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 60
> 2017-04-25 20:36:49 : epoch 00040183 : <NODENAME> : ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server recovery event 2 nodeid -1 ip <CESIP>
> 2017-04-25 20:36:49 : epoch 00040183 : <NODENAME> : ganesha.nfsd-1261[dbus] nfs_release_v4_client :STATE :EVENT :NFS Server V4 recovery release ip <CESIP>
> 2017-04-25 20:36:49 : epoch 00040183 : <NODENAME> : ganesha.nfsd-1261[dbus] nfs_in_grace :STATE :EVENT :NFS Server Now IN GRACE
> 2017-04-25 20:37:42 : epoch 00040183 : <NODENAME> : ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 60
> 2017-04-25 20:37:44 : epoch 00040183 : <NODENAME> : ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 60
> 2017-04-25 20:37:44 : epoch 00040183 : <NODENAME> : ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server recovery event 4 nodeid 2 ip
>
> We can't see in any of the logs WHY ganesha is going into grace. Any
> suggestions on how to debug this further? (I.e. if we can stop the grace
> issues, we can mostly solve the problem.)
>
> 2. Our clients are using LDAP, which is bound to the CES IPs. If we shut
> down nslcd on the client, we can get the client to recover once all the
> TIME_WAIT connections have gone.
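One quick way to see how often and when a node enters grace is to grep the Ganesha log for the events quoted above. A minimal sketch; the log path varies by install (an assumption here), and the here-doc simply reuses sample entries from the thread so the sketch runs as-is:

```shell
#!/bin/sh
# Summarise Ganesha grace events per timestamp. In practice, point LOG at
# the real ganesha.nfsd log instead of generating this sample file.
LOG=/tmp/ganesha_grace_sample.log
cat > "$LOG" <<'EOF'
2017-04-25 20:36:49 : epoch 00040183 : node1 : ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 60
2017-04-25 20:36:49 : epoch 00040183 : node1 : ganesha.nfsd-1261[dbus] nfs_in_grace :STATE :EVENT :NFS Server Now IN GRACE
2017-04-25 20:37:42 : epoch 00040183 : node1 : ganesha.nfsd-1261[dbus] nfs4_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 60
EOF
# One line per timestamp with an event count -- bursts of grace events
# should line up with CES IP moves in the mmfs/CES logs.
grep -E 'nfs4_start_grace|nfs_in_grace' "$LOG" | awk '{print $1, $2}' | sort | uniq -c
```

Correlating these timestamps against the CES address-move events may at least narrow down what is kicking the server into grace.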
> Maybe this was a bad choice on our side, to bind to the CES IPs - we
> figured it would handily move the IPs for us, but I guess mmcesfuncs
> isn't aware of this and so doesn't kill the connections to the IP as it
> goes away.
>
> So, two approaches we are going to try. First, reconfigure nslcd on a
> couple of clients and see if they still show the issues when fail-over
> occurs. Second, work out why the NFS servers are going into grace in the
> first place.
>
> Simon
>
> On 26/04/2017, 00:46, "[email protected] on behalf
> of [email protected]" wrote:
>
> >Are you using infiniband or Ethernet? I'm wondering if IBM have solved
> >the gratuitous arp issue which we see with our non-protocols NFS
> >implementation.
> >
> >-----Original Message-----
> >From: [email protected] On Behalf Of Simon
> >Thompson (IT Research Support)
> >Sent: Wednesday, 26 April 2017 3:31 AM
> >To: gpfsug main discussion list <[email protected]>
> >Subject: Re: [gpfsug-discuss] NFS issues
> >
> >I did some digging in mmcesfuncs to see what happens server side on
> >fail-over.
> >
> >Basically, the server losing the IP is supposed to terminate all
> >sessions, and the receiving server sends ACK tickles.
> >
> >My current supposition is that, for whatever reason, the losing server
> >isn't releasing something and the client still has hold of a connection
> >which is mostly dead. The tickle from the new server to the client then
> >fails.
> >
> >This would explain why failing the IP back to the original server
> >usually brings the client back to life.
> >
> >This is only my working theory at the moment, as we can't reliably
> >reproduce this. Next time it happens we plan to grab some netstat
> >output from each side.
> >
> >Then we plan to issue "mmcmi tcpack $cesIpPort $clientIpPort" on the
> >server that received the IP and see if that fixes it (i.e. the
> >receiving server didn't tickle properly).
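Jan-Frode's observation further down the thread (connections the client believes are open but the server no longer sees) suggests a concrete way to spot the half-dead sessions: normalise the ESTABLISHED connections from each side's `netstat -tn` output and diff them. A sketch with illustrative sample data (the IPs and the capture files are assumptions; in practice feed it real netstat output from the client and the node now holding the CES IP):

```shell
#!/bin/sh
# Find connections the client still holds to the CES IP that the server
# has forgotten. Sample data stands in for `netstat -tn` from each side.
cat > /tmp/client.txt <<'EOF'
tcp 0 0 10.0.0.50:876 10.0.0.10:2049 ESTABLISHED
tcp 0 0 10.0.0.50:877 10.0.0.10:2049 ESTABLISHED
EOF
cat > /tmp/server.txt <<'EOF'
tcp 0 0 10.0.0.10:2049 10.0.0.50:876 ESTABLISHED
EOF
# Normalise both sides to "clientip:port serverip:port" pairs, then keep
# only the pairs present on the client but absent on the server.
awk '$6=="ESTABLISHED"{print $4, $5}' /tmp/client.txt | sort > /tmp/c.norm
awk '$6=="ESTABLISHED"{print $5, $4}' /tmp/server.txt | sort > /tmp/s.norm
comm -23 /tmp/c.norm /tmp/s.norm
```

Any pair left over is a mostly-dead connection of the kind described above and, per the thread, a candidate for a manual "mmcmi tcpack $cesIpPort $clientIpPort" tickle from the node that received the IP.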
> >(Usage extracted from mmcesfuncs, which is ksh of course.) CesIPPort is
> >a colon-separated IP:portnumber (of NFSd), for anyone interested.
> >
> >Then try to kill the sessions on the losing server, to check if there
> >is stuff still open, and re-tickle the client.
> >
> >If we can get steps to work around it, I'll log a PMR. I suppose I
> >could do that now, but given it's non-deterministic and we want to be
> >100% sure it's not us doing something wrong, I'm inclined to wait until
> >we do some more testing.
> >
> >I agree with the suggestion that it's probably nodes with IO pending
> >that are affected, but I don't have any data to back that up yet. We
> >did try with a read workload on a client, but maybe we need either long
> >IO-blocked reads or writes (from the GPFS end).
> >
> >We also originally had soft as the default option, but saw issues then,
> >and the docs suggested hard, so we switched and also enabled sync (we
> >figured maybe it was the NFS client with uncommitted writes), but
> >neither has resolved the issues entirely. It's difficult for me to say
> >whether they improved things, given it's sporadic.
> >
> >Appreciate people's suggestions!
> >
> >Thanks
> >
> >Simon
> >________________________________________
> >From: [email protected] on behalf of Jan-Frode
> >Myklebust [[email protected]]
> >Sent: 25 April 2017 18:04
> >To: gpfsug main discussion list
> >Subject: Re: [gpfsug-discuss] NFS issues
> >
> >I *think* I've seen this, and that we then had open TCP connections
> >from client to NFS server according to netstat, but these connections
> >were not visible from netstat on the NFS-server side.
> >
> >Unfortunately I don't remember what the fix was...
> >
> > -jf
> >
> >On Tue, 25 Apr 2017 at 16:06, Simon Thompson (IT Research Support)
> ><[email protected]> wrote:
> >
> >Hi,
> >
> >From what I can see, Ganesha uses the Export_Id option in the config
> >file (which is managed by CES) for this.
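For reference, the Export_Id (and, if one wanted to pin it, a static filesystem id) lives in the per-export EXPORT block of the Ganesha config. A sketch of what a pinned export might look like; the path, ids and whether CES preserves a hand-edited Filesystem_Id are assumptions, since CES normally manages this file:

```
EXPORT {
    Export_Id = 101;            # must be identical on every protocol node
    Path = /gpfs/fs1/export;    # hypothetical export path
    Pseudo = /gpfs/fs1/export;
    Filesystem_Id = 101.101;    # optional static fsid, major.minor
    FSAL { Name = GPFS; }
}
```

The key point from the discussion below is simply that whatever feeds the NFS filehandle (Export_Id or fsid) must be identical across all nodes, or clients get stale filehandles after a failover.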
> >I did find some reference on the Ganesha devs list that, if it's not
> >set, it would read the FSID from the GPFS file-system; either way, they
> >should surely be consistent across all the nodes. The posts I found
> >were from someone with an IBM email address, so I guess someone on the
> >IBM teams.
> >
> >I checked a couple of my protocol nodes and they use the same Export_Id
> >consistently, though I guess that might not be the same as the FSID
> >value.
> >
> >Perhaps someone from IBM could comment on whether FSID is likely to be
> >the cause of my problems?
> >
> >Thanks
> >
> >Simon
> >
> >On 25/04/2017, 14:51, "[email protected] on
> >behalf of Ouwehand, JJ" <[email protected]> wrote:
> >
> >>Hello,
> >>
> >>First, a short introduction. My name is Jaap Jan Ouwehand; I work at
> >>a Dutch hospital, "VU Medical Center" in Amsterdam. We make daily use
> >>of IBM Spectrum Scale, Spectrum Archive and Spectrum Protect in our
> >>critical (office, research and clinical data) business processes. We
> >>have three large GPFS filesystems for different purposes.
> >>
> >>We also had such a situation with cNFS. A failover (IP takeover)
> >>worked technically, only clients experienced "stale filehandles". We
> >>opened a PMR at IBM, and after testing, delivering logs, tcpdumps and
> >>a few months, the solution appeared to be in the fsid option.
> >>
> >>An NFS filehandle is built from a combination of the fsid and a hash
> >>function on the inode. After a failover, the fsid value can be
> >>different and the client gets a "stale filehandle". To avoid this, the
> >>fsid value can be statically specified. See:
> >>
> >>https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.2/com.ibm.spectrumscale.v4r22.doc/bl1adm_nfslin.htm
> >>
> >>Maybe there is also a value in Ganesha that changes after a failover.
> >>Certainly, since most sessions will be re-established after a
> >>failback. Maybe you'll see more debug information with tcpdump.
> >>
> >>Kind regards,
> >>
> >>Jaap Jan Ouwehand
> >>ICT Specialist (Storage & Linux)
> >>VUmc - ICT
> >>E: [email protected]
> >>W: www.vumc.com
> >>
> >>-----Original Message-----
> >>From: [email protected] On Behalf Of Simon
> >>Thompson (IT Research Support)
> >>Sent: Tuesday, 25 April 2017 13:21
> >>To: [email protected]
> >>Subject: [gpfsug-discuss] NFS issues
> >>
> >>Hi,
> >>
> >>We have recently started deploying NFS in addition to our existing SMB
> >>exports on our protocol nodes.
> >>
> >>We use a RR DNS name that points to 4 VIPs for SMB services, and
> >>failover seems to work fine with SMB clients. We figured we could use
> >>the same name and IPs and run Ganesha on the protocol servers;
> >>however, we are seeing issues with NFS clients when IP failover
> >>occurs.
> >>
> >>In normal operation on a client, we might see several mounts from
> >>different IPs, obviously due to the way the DNS RR is working, but it
> >>all works fine.
> >>
> >>In a failover situation, the IP will move to another node and some
> >>clients will carry on; others will hang IO to the mount points
> >>referred to by the IP which has moved. We can *sometimes* trigger this
> >>by manually suspending a CES node, but not always, and some clients
> >>mounting from the moving IP will be fine, others won't.
> >>
> >>If we resume a node and it fails back, the clients that are hanging
> >>will usually recover fine.
> >>We can reboot a client prior to failback and it will be fine;
> >>stopping and starting the ganesha service on a protocol node will
> >>also sometimes resolve the issues.
> >>
> >>So, has anyone seen this sort of issue, and any suggestions for how
> >>we could either debug it more or work around it?
> >>
> >>We are currently running the packages
> >>nfs-ganesha-2.3.2-0.ibm32_1.el7.x86_64 (the 4.2.2-2 release ones).
> >>
> >>At one point we were seeing it a lot, and could track it back to an
> >>underlying GPFS network issue that was causing protocol nodes to be
> >>expelled occasionally. We resolved that and the issues became less
> >>apparent, but maybe we just fixed one failure mode, so we see it less
> >>often.
> >>
> >>On the clients we use -o sync,hard, BTW, as in the IBM docs.
> >>
> >>On a client showing the issues, we'll see NFS-related messages in
> >>dmesg like:
> >>
> >>[Wed Apr 12 16:59:53 2017] nfs: server MYNFSSERVER.bham.ac.uk not
> >>responding, timed out
> >>
> >>Which explains the client hang on certain mount points.
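For reference, a client mount with the options described above (sync,hard, per the IBM docs mentioned) would look something like this as an /etc/fstab entry; the server name and paths are placeholders:

```
# /etc/fstab -- NFSv4 mount of a CES export; sync,hard per the IBM docs.
# With hard, IO blocks indefinitely on server loss (the hangs seen here);
# with soft, it errors out instead, which the thread notes caused other issues.
mynfsserver.example.com:/gpfs/fs1/export  /mnt/gpfs  nfs4  sync,hard  0  0
```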
> >>The symptoms feel very much like those logged in this Gluster/Ganesha
> >>bug:
> >>https://bugzilla.redhat.com/show_bug.cgi?id=1354439
> >>
> >>Thanks
> >>
> >>Simon
> >>
> >>_______________________________________________
> >>gpfsug-discuss mailing list
> >>gpfsug-discuss at spectrumscale.org
> >>http://gpfsug.org/mailman/listinfo/gpfsug-discuss
