Just looking at the code, I don't see where you retry the lock request from the FSAL, which is required to add it back to the FSAL queue. Am I missing something?
Marc.

From: "Frank Filz" <ffilz...@mindspring.com>
To: Marc Eshel/Almaden/IBM@IBMUS
Cc: "'nfs-ganesha-devel'" <nfs-ganesha-devel@lists.sourceforge.net>
Date: 09/07/2016 03:57 PM
Subject: RE: [Nfs-ganesha-devel] NLM async locking

Marc,

Could you try the top commit in this branch:

https://github.com/ffilz/nfs-ganesha/commits/async

It may not be the complete solution, but I think it will help your scenario. I need to do more work on async blocking locks...

And it looks like without async blocking lock support, Ganesha doesn't handle the case where a lock blocks on a conflicting lock from outside the Ganesha instance.

I will be looking at implementing my thread pool idea that I modeled in the multilock tool.

Frank

> -----Original Message-----
> From: Frank Filz [mailto:ffilz...@mindspring.com]
> Sent: Wednesday, September 7, 2016 9:42 AM
> To: 'Marc Eshel' <es...@us.ibm.com>
> Cc: 'nfs-ganesha-devel' <nfs-ganesha-devel@lists.sourceforge.net>
> Subject: Re: [Nfs-ganesha-devel] NLM async locking
>
> Ok, I'm not sure this ever worked right...
>
> With the lock available upcall, we never put the lock back on the blocked
> lock list if an attempt to acquire the lock from the FSAL fails...
>
> So the way the lock available upcall is supposed to work:
>
> Client requests conflicting lock
> Blocked lock gets registered by FSAL
> SAL puts lock on blocked lock list
> Time passes
> FSAL makes lock available upcall
> SAL finds the blocked lock entry in the blocked lock list
> SAL makes a call to FSAL to attempt to acquire the lock
> Assume that fails (in the example, because multiple conflicting locks got
> notified)
> SAL puts the lock BACK on the blocked lock list (this step is missing) and
> all is well....
> Time passes
> FSAL makes lock available upcall
> SAL finds the blocked lock entry in the blocked lock list
> SAL makes a call to FSAL to attempt to acquire the lock
> Lock is granted by FSAL
> SAL makes async call back to client
> If THAT fails, SAL releases the lock from the FSAL and disposes of the lock
> entry and all is well
> If THAT succeeds, the lock is completely granted and all is well
>
> I also see that if the client retries the lock before it is granted, we
> don't remove the lock entry from the blocked lock list... I don't think that
> will ever cause a problem but we should clean that up also...
>
> Let me try a patch to fix...
>
> Frank
>
> > -----Original Message-----
> > From: Marc Eshel [mailto:es...@us.ibm.com]
> > Sent: Tuesday, September 6, 2016 9:34 PM
> > To: Frank Filz <ffilz...@mindspring.com>
> > Cc: 'nfs-ganesha-devel' <nfs-ganesha-devel@lists.sourceforge.net>
> > Subject: RE: NLM async locking
> >
> > Did you get a chance to look at this problem?
> > Marc.
> >
> > From: "Frank Filz" <ffilz...@mindspring.com>
> > To: Marc Eshel/Almaden/IBM@IBMUS
> > Cc: "'nfs-ganesha-devel'" <nfs-ganesha-devel@lists.sourceforge.net>
> > Date: 08/29/2016 02:37 PM
> > Subject: RE: NLM async locking
> >
> > > I see the following failure:
> > > 1. Get conflicting locks from 3 clients
> > >    cli 1 gets 0-100
> > >    cli 2 is blocked on 0-1000
> > >    cli 3 is blocked on 0-10000
> > > 2. cli 1 unlocks
> > >    up-call for cli 2 and 3 to retry
> > >    cli 2 gets 0-1000
> > >    cli 3 is blocked on 0-1000
> > > 3. cli 2 unlocks
> > >    up-call for cli 3 but Ganesha fails
> > >
> > > /* We must be out of sync with FSAL, this is fatal */
> > > LogLockDesc(COMPONENT_STATE, NIV_MAJ,
> > >             "Blocked Lock Not Found for", obj, owner, lock);
> > > LogFatal(COMPONENT_STATE, "Locks out of sync with FSAL");
> > >
> > > I think the problem is in step 2, after cli 3 failed for the second time
> > > it is not put back in queue, the sbd_list.
> > >
> > > Can you please confirm? This logic is very complicated.
> >
> > That sounds like a likely problem. I'd have to dig into the code to see
> > why... May take me a day or two to investigate.
> >
> > Frank
> >
> > ---
> > This email has been checked for viruses by Avast antivirus software.
> > https://www.avast.com/antivirus
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
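
For anyone following the thread, here is a minimal, self-contained C sketch of the lock available upcall flow Frank describes, with the put-back step included. It is a toy model and not Ganesha code: the names (blocked_lock, lock_table, try_acquire, lock_available_upcall, release_lock, run_scenario) are made up for illustration, the `queued` flag only stands in for membership of the blocked lock list, and try_acquire is just a stand-in for the call into the FSAL. run_scenario replays the three-client case from Marc's report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; none of these names come from Ganesha. */
struct blocked_lock {
	int client;       /* id of the requesting client */
	long start, end;  /* inclusive byte range */
	bool queued;      /* true while on the (modelled) blocked lock list */
	bool granted;     /* true while the lock is held */
};

struct lock_table {
	struct blocked_lock *entries;
	size_t count;
};

static bool overlaps(const struct blocked_lock *a,
		     const struct blocked_lock *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/* Stand-in for the FSAL attempt: fails if any held lock overlaps. */
static bool try_acquire(struct lock_table *t, struct blocked_lock *want)
{
	for (size_t i = 0; i < t->count; i++) {
		struct blocked_lock *e = &t->entries[i];

		if (e != want && e->granted && overlaps(e, want))
			return false;
	}
	want->granted = true;
	return true;
}

/*
 * Lock available upcall for one waiter. Returns false when the waiter is
 * not on the blocked list, which models the "out of sync" case.
 */
static bool lock_available_upcall(struct lock_table *t,
				  struct blocked_lock *w)
{
	assert(t != NULL && w != NULL);

	if (!w->queued)
		return false;     /* blocked lock not found */

	w->queued = false;        /* taken off the list for the attempt */
	if (!try_acquire(t, w))
		w->queued = true; /* put it BACK so a later upcall finds it */
	return true;
}

static void release_lock(struct blocked_lock *l)
{
	l->granted = false;
}

/* Replays the reported scenario; true if cli 3 ends up with its lock. */
static bool run_scenario(void)
{
	struct blocked_lock l[3] = {
		{ 1, 0, 100,   false, true  },  /* cli 1 holds 0-100 */
		{ 2, 0, 1000,  true,  false },  /* cli 2 blocked */
		{ 3, 0, 10000, true,  false },  /* cli 3 blocked */
	};
	struct lock_table t = { l, 3 };

	release_lock(&l[0]);                    /* step 2: cli 1 unlocks */
	if (!lock_available_upcall(&t, &l[1]))  /* cli 2 gets 0-1000 */
		return false;
	if (!lock_available_upcall(&t, &l[2]))  /* cli 3 fails, re-queued */
		return false;

	release_lock(&l[1]);                    /* step 3: cli 2 unlocks */
	if (!lock_available_upcall(&t, &l[2]))  /* cli 3 is found again */
		return false;
	return l[2].granted;
}
```

If the put-back line in lock_available_upcall is removed, the last upcall in run_scenario finds cli 3 off the list and returns false, which is the toy equivalent of the "Blocked Lock Not Found" path quoted above.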