On 9/22/21 6:57 AM, Andreas Gruenbacher wrote:
On Thu, Sep 16, 2021 at 9:11 PM Bob Peterson <rpete...@redhat.com> wrote:
Before this patch, when a glock was locked, the very first holder on the
queue would unlock the lockref and call the go_lock glops function (if
one exists), unless GL_SKIP was specified. When we introduced the new
node-scope concept, we allowed multiple holders to lock glocks in EX mode
and share the lock, but node-scope introduced a new problem: if the
first holder has GL_SKIP and the next one does NOT, since it is not the
first holder on the queue, the go_lock op was not called.

We use go_lock to (re)validate inodes (for inode glocks) and to read
in bitmaps (for resource group glocks). I can see how calling go_lock
was originally tied to the first lock holder, but GL_SKIP already
broke the simple model that the first holder will call go_lock. The
go_lock_needed callback only makes things worse yet again,
unfortunately.

In what way does go_lock_needed make things worse?

How about we introduce a new GLF_REVALIDATE flag that indicates that
go_lock needs to be called? The flag would be set when instantiating a
new glock and when dequeuing the last holder, and cleared in go_lock
(and in gfs2_inode_refresh for GL_SKIP holders). I'm not sure if

That was my original design, and it makes the most sense. I named the flag GLF_GO_LOCK_SKIPPED, but essentially the same thing. Unfortunately, I ran into all kinds of problems implementing it. In those patches, first holders would either call glops->go_lock() or set GLF_GO_LOCK_SKIPPED. Once the go_lock function was complete, it cleared GLF_GO_LOCK_SKIPPED, and called wake_up_bit. Secondary holders did wait_on_bit and waited for the other process's go_lock to complete.

But I had tons of problems getting this to work properly. Processes would hang and deadlock for seemingly no reason. Finally I got frustrated and sought other solutions.

I'm willing to try to resurrect that patch set and try again. Maybe you can help me figure out what I'm doing wrong and why it's not working.

Bob Peterson

GLF_REVALIDATE can fully replace GIF_INVALID as well, but it looks
like it at first glance.

Thanks,
Andreas

Reply via email to