This is a second go at my "Radical Reform of glock state machine"
patch set. It's been rebased to the latest git tree, and had a lot more
testing (from the actual for-next tree rather than testing the ports).
This includes performance testing using iozone.

The patches are unchanged from my original post on 09 July, except that
I've added one patch at the beginning, "Remove gotos from function run_queue."

Just to reiterate: The goal here is to preserve the current logic of
today's glock state machine, so no attempts have been made (yet) to
make it faster or more sane. The longer-term goal is to allow gfs2 to
reduce its use of the glock workqueue by calling the state machine
directly from the locking process, thus reducing glock latency and
greatly reducing context switches. That will be another set of patches.

I'd like to get this pushed early in the cycle so it doesn't keep
getting pushed out for various merge window cycles.

Bob Peterson
---
GFS2 glocks go through a very complex and confusing series of steps to
transition from one locking mode to another. For example, to transition
from unlocked (UN) mode to Shared (SH) mode we need to "promote" the lock,
and to do so requires a series of steps such as asking permission of DLM,
fetching dlm responses, checking for remote demotion requests, granting
multiple holders, and so forth.

This is all managed by a large set of functions known as the glock state
machine. Today, the glock state machine is a disorganized chaos of
functions that are neither documented nor intuitive. It's a complete
disaster that makes no sense at all to someone coming in to the code cold.
For proof, you need only look at the fact that function do_xmote() can
call function finish_xmote(), but finish_xmote() can call do_xmote() as
well. It works really well, but the problem is: it's a house of cards.
It's easy to misunderstand the intent and get it wrong, and if you try to
make any little change, everything falls apart.

The other problem is that today's glock state machine relies completely
on the delayed work queue. That means you can't simply transition from
one state to the next; you need to queue delayed work and have the
delayed workqueue do the work. This requires an absurd number of context
switches to make the simplest change to a glock. That creates a lot of
unnecessary latency, and makes it much harder for smaller environments,
like virts, to do the job, due to a potentially very limited number of
cores.

This patch set is my first pass designed to radically reform the
gfs2 glock state machine. Each patch slowly transforms it into more and
more of a real state machine.

The first few patches merely untangle the mess. After that, each
patch removes an old-style state and adds it as a new state for the new
state machine.

The result is a state machine that's readable: each state is now
clearly labeled, and it's more obvious what each state does and which
other states it can transition to.
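
As a rough illustration of what "a real state machine" means here, the
following userspace sketch shows the general shape: one loop, one switch,
and each case naming the state it moves to next. It is not code from the
patches. Only GL_ST_RUN and GL_ST_PROMOTE appear in the patch titles; the
other state names, the toy_glock structure, and every transition shown
are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* GL_ST_RUN and GL_ST_PROMOTE are named in the patch titles; the other
 * states and all transitions below are hypothetical. */
enum gl_machine_state {
	GL_ST_IDLE,
	GL_ST_RUN,
	GL_ST_DO_XMOTE,
	GL_ST_FINISH_XMOTE,
	GL_ST_PROMOTE,
};

#define TOY_TRACE_MAX 8

/* Hypothetical stand-in for a glock; not struct gfs2_glock. */
struct toy_glock {
	int target_granted;	/* simulated DLM reply has arrived */
	int holders_granted;	/* number of holders promoted so far */
	enum gl_machine_state trace[TOY_TRACE_MAX];
	size_t ntrace;
};

/* Run from state st until the machine goes idle, recording each state. */
static void toy_state_machine(struct toy_glock *gl, enum gl_machine_state st)
{
	while (st != GL_ST_IDLE) {
		assert(gl->ntrace < TOY_TRACE_MAX);
		gl->trace[gl->ntrace++] = st;

		switch (st) {
		case GL_ST_RUN:
			/* Decide whether a lock mode change is still needed. */
			st = gl->target_granted ? GL_ST_PROMOTE : GL_ST_DO_XMOTE;
			break;
		case GL_ST_DO_XMOTE:
			/* Pretend the lock manager answers immediately. */
			gl->target_granted = 1;
			st = GL_ST_FINISH_XMOTE;
			break;
		case GL_ST_FINISH_XMOTE:
			/* Instead of calling do_xmote again, name the next state. */
			st = GL_ST_PROMOTE;
			break;
		case GL_ST_PROMOTE:
			gl->holders_granted++;
			st = GL_ST_IDLE;
			break;
		default:
			st = GL_ST_IDLE;
			break;
		}
	}
}
```

Starting a zeroed toy_glock in GL_ST_RUN walks through the four states in
order and grants one holder. The point of the shape is that no state
handler calls another one directly, so the mutual recursion between the
xmote steps turns into explicit, reviewable transitions.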

One primary goal here is to leave the logic completely unchanged, so
that code reviewers can check that I did not break today's existing
logic. I'm not trying to fix any bugs here; I'm just untangling
spaghetti. The only exceptions are a few optimizations at the end.

Another goal was to make each patch as small and digestible as possible
to avoid any hand waving.

I'm planning future patches to reduce the context switches by making
the state machine not rely as much on the glock work queue. These
patches will speed up glocks by reducing the work queue delays, by
executing the state machine from within the process context. For
example, there are only a few cases in which we need to ensure a
glock demote happens later (minimum hold time).
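
To make the minimum hold time case concrete, here is a small userspace
sketch of the decision involved: if the glock has already been held long
enough, the demote could run immediately in the caller's context, and
only otherwise does it need to be deferred. This is an assumption about
how such a check might look, not code from these or the future patches;
the function name, its parameters, and the tick-based time values are
all hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper: how long must a demote still wait?
 *
 * now, granted_at and min_hold are in arbitrary ticks. Returns 0 when
 * the minimum hold time has already elapsed (the demote may run right
 * away), otherwise the number of ticks left to wait.
 */
static unsigned long toy_demote_delay(unsigned long now,
				      unsigned long granted_at,
				      unsigned long min_hold)
{
	/* Unsigned subtraction stays correct if the tick counter wraps. */
	unsigned long held = now - granted_at;
	unsigned long delay = held >= min_hold ? 0 : min_hold - held;

	assert(delay <= min_hold);
	return delay;
}
```

A caller would run the state machine directly when the result is 0 and
fall back to delayed work only for a nonzero result, which is the "few
cases" situation described above.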
---
Bob Peterson (13):
  GFS2: Remove gotos from function run_queue
  GFS2: Make do_xmote determine its own gh parameter
  GFS2: Eliminate a goto in finish_xmote
  GFS2: Baby step toward a real state machine: finish_xmote
  GFS2: Add do_xmote states to state machine
  GFS2: Make do_xmote not call the state machine again
  GFS2: Add blocking and non-blocking demote to state machine
  GFS2: Add a new GL_ST_PROMOTE state to glock state machine
  GFS2: Replace run_queue with new GL_ST_RUN state in state machine
  GFS2: Reduce redundancy in GL_ST_DEMOTE_NONBLOCK state
  GFS2: Reduce glock_work_func to a single call to state_machine
  GFS2: Add new GL_ST_UNLOCK state to reduce calls to the __ version
  GFS2: Optimization of GL_ST_UNLOCK state

 fs/gfs2/glock.c  | 327 +++++++++++++++++++++++++++++------------------
 fs/gfs2/glock.h  |  14 ++
 fs/gfs2/incore.h |   1 +
 3 files changed, 221 insertions(+), 121 deletions(-)

-- 
2.19.1
