This is a second go at my "Radical Reform of glock state machine" patch set. It has been rebased to the latest git tree, and has had a lot more testing (from the actual for-next tree rather than testing the ports). This includes performance testing using iozone.

The patches are unchanged from my original post on 09 July, except that I've added one patch at the beginning, "Remove gotos from function run_queue."

Just to reiterate: the goal here is to preserve the current logic of today's glock state machine, so no attempts have been made (yet) to make it faster or more sane. The eventual goal is to allow gfs2 to reduce its use of the glock workqueue by calling the state machine directly from the locking process, thus reducing glock latency and greatly reducing context switches. That will be another set of patches.

I'd like to get this pushed early in the cycle so it doesn't keep getting pushed out for various merge window cycles.

Bob Peterson
---
GFS2 glocks go through a very complex and confusing series of steps to transition from one locking mode to another. For example, to transition from unlocked (UN) mode to shared (SH) mode we need to "promote" the lock, and doing so requires a series of steps such as asking permission of DLM, fetching DLM responses, checking for remote demotion requests, granting multiple holders, and so forth. This is all managed by a large set of functions known as the glock state machine.

Today, the glock state machine is a disorganized chaos of functions that are neither documented nor intuitive. It makes no sense at all to someone coming to the code cold. For proof, you need only look at the fact that function do_xmote() can call function finish_xmote(), but finish_xmote() can call do_xmote() as well. It works really well, but the problem is: it's a house of cards. It's easy to misunderstand the intent and get it wrong, and if you try to make any little change, everything falls apart.

The other problem is that today's glock state machine relies completely on the delayed work queue. That means you can't simply transition from one state to the next; you need to queue delayed work and have the delayed workqueue do the work.
This requires an absurd number of context switches to make the simplest change to a glock. That creates a lot of unnecessary latency, and makes it much harder for smaller environments, like virts, to do the job, due to a potentially very limited number of cores.

This patch set is my first pass at radically reforming the gfs2 glock state machine. Each patch slowly transforms it into more and more of a real state machine. The first few patches merely untangle the mess. After that, each patch removes an old-style state and adds it as a new state for the new state machine. The result is a state machine that's readable: each state is now clearly labeled, and it is more obvious what it's doing and what other states it can transition to.

One primary goal here is to leave the logic completely unchanged. I wanted to make it so that code reviewers could check to make sure I did not break today's existing logic. I do, however, have a few optimizations at the end. So I'm not trying to fix any bugs here. I'm just untangling spaghetti.

Another goal was to make each patch as small and digestible as possible to avoid any hand waving.

I'm planning future patches to reduce the context switches by making the state machine not rely as much on the glock work queue. These patches will speed up glocks by reducing the work queue delays, by executing the state machine from within the process context. For example, there are only a few cases in which we need to ensure a glock demote happens later (minimum hold time).
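
To give reviewers a feel for the shape of the restructuring described above, here is a minimal userspace sketch. It is not code from the patches. The names toy_glock, toy_state_machine and the TOY_ST_* states are hypothetical and only loosely modeled on the GL_ST_* names in the patch titles. It shows how two functions that call each other (the do_xmote()/finish_xmote() pattern) can be replaced by one loop in which each state merely names its successor:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states, for illustration only; not the gfs2 definitions. */
enum toy_state {
	TOY_ST_RUN,          /* examine the request and pick the next step */
	TOY_ST_DO_XMOTE,     /* ask the lock manager for a mode change */
	TOY_ST_FINISH_XMOTE, /* handle the lock manager's reply */
	TOY_ST_PROMOTE,      /* grant the waiting holder */
	TOY_ST_IDLE,         /* nothing left to do */
};

/* Hypothetical stand-in for a glock; the fields are assumptions. */
struct toy_glock {
	int cur_mode;       /* mode currently held */
	int target_mode;    /* mode being requested */
	int retries_left;   /* replies that will say "try again" */
	bool granted;       /* set once the holder is granted */
	unsigned int steps; /* states visited, for the example only */
};

/*
 * One loop drives every transition. A state never calls another state's
 * handler; it only assigns the next state, so the control flow is flat
 * and each transition is visible in one place.
 */
void toy_state_machine(struct toy_glock *gl, enum toy_state state)
{
	while (state != TOY_ST_IDLE) {
		gl->steps++;
		assert(gl->steps < 1000); /* guard against a runaway loop */

		switch (state) {
		case TOY_ST_RUN:
			state = (gl->cur_mode == gl->target_mode) ?
				TOY_ST_PROMOTE : TOY_ST_DO_XMOTE;
			break;
		case TOY_ST_DO_XMOTE:
			/* A real implementation would send a request here. */
			state = TOY_ST_FINISH_XMOTE;
			break;
		case TOY_ST_FINISH_XMOTE:
			if (gl->retries_left > 0) {
				/* Reply said "try again": go back, no recursion. */
				gl->retries_left--;
				state = TOY_ST_DO_XMOTE;
			} else {
				gl->cur_mode = gl->target_mode;
				state = TOY_ST_RUN;
			}
			break;
		case TOY_ST_PROMOTE:
			gl->granted = true;
			state = TOY_ST_IDLE;
			break;
		case TOY_ST_IDLE:
			break;
		}
	}
}
```

A caller would zero-initialize a struct toy_glock, set target_mode, and call toy_state_machine(&gl, TOY_ST_RUN) directly from its own context. In this toy nothing is queued to a worker thread, which is the direction the planned follow-up patches aim for.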
---
Bob Peterson (13):
  GFS2: Remove gotos from function run_queue
  GFS2: Make do_xmote determine its own gh parameter
  GFS2: Eliminate a goto in finish_xmote
  GFS2: Baby step toward a real state machine: finish_xmote
  GFS2: Add do_xmote states to state machine
  GFS2: Make do_xmote not call the state machine again
  GFS2: Add blocking and non-blocking demote to state machine
  GFS2: Add a new GL_ST_PROMOTE state to glock state machine
  GFS2: Replace run_queue with new GL_ST_RUN state in state machine
  GFS2: Reduce redundancy in GL_ST_DEMOTE_NONBLOCK state
  GFS2: Reduce glock_work_func to a single call to state_machine
  GFS2: Add new GL_ST_UNLOCK state to reduce calls to the __ version
  GFS2: Optimization of GL_ST_UNLOCK state

 fs/gfs2/glock.c  | 327 +++++++++++++++++++++++++++++------------------
 fs/gfs2/glock.h  |  14 ++
 fs/gfs2/incore.h |   1 +
 3 files changed, 221 insertions(+), 121 deletions(-)

--
2.19.1