Hi,
On 10/01/18 20:42, Bob Peterson wrote:
This patch adds logic to function gfs2_rgrp_congested to check for
intra-node congestion by calling a new function, other_rgrp_users,
which checks whether there are other glock holders (done only before
we lock it ourselves) and whether there are other block reservations.
If there is a block reservation, we know it is for a different
inode: if it were for ours, we would have been forced to use it and
we would not be searching for a new one.
Signed-off-by: Bob Peterson <[email protected]>
This is the wrong solution, I think. We should not be choosing other
rgrps for allocation due to local contention. That is just likely to
create more fragmentation by spreading things out across the disk.
The issue is that we can't currently separate the locking into dlm
(exclusive) and local (shared, with the important data structures
spinlock protected), so we land up with lock contention on the rgrp
and everything serialized via the glock. At the moment the glock state
machine only supports everything (dlm & local) shared or everything
exclusive.
If we want to resolve this issue, then we need to deal with that
limitation first and allow multiple local users of the same rgrp to
co-exist with minimal lock contention. That requires the glock state
machine to support that particular lock mode first, though.
Steve.
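
For illustration only, the split Steve describes (one exclusive DLM hold per node, with local sharers serializing only on a short-lived lock around the rgrp data structures) could be sketched in user space roughly as below. Every name here is invented for the sketch; none of it is real gfs2, glock, or DLM API, and a mutex stands in for the spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Illustrative model only: one node-wide exclusive DLM hold, under which
 * any number of local processes share the rgrp and serialize only briefly
 * on a lock around the data structures. All names are hypothetical. */
struct model_rgrp {
	bool dlm_held;			/* node-wide exclusive DLM hold */
	int local_sharers;		/* concurrent local users */
	pthread_mutex_t data_lock;	/* stand-in for the rgrp spinlock */
};

static void model_rgrp_init(struct model_rgrp *r)
{
	r->dlm_held = false;
	r->local_sharers = 0;
	pthread_mutex_init(&r->data_lock, NULL);
}

/* The first local user fetches the DLM lock; later users just join in,
 * so local contention no longer forces everything through the glock. */
static void model_local_enter(struct model_rgrp *r)
{
	pthread_mutex_lock(&r->data_lock);
	if (r->local_sharers++ == 0)
		r->dlm_held = true;	/* stand-in for a real DLM request */
	pthread_mutex_unlock(&r->data_lock);
}

static void model_local_exit(struct model_rgrp *r)
{
	pthread_mutex_lock(&r->data_lock);
	if (--r->local_sharers == 0)
		r->dlm_held = false;	/* stand-in for a DLM release */
	pthread_mutex_unlock(&r->data_lock);
}
```

The point of the sketch is only that the last local user out releases the node-wide hold, so intermediate local users never touch DLM at all.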
---
fs/gfs2/rgrp.c | 43 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 40 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 641bb4a8cf5b..6db1bf4816ff 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1943,10 +1943,37 @@ static inline bool fast_to_acquire(const struct gfs2_rgrpd *rgd)
return true;
}
+/**
+ * other_rgrp_users - figure out if this rgrp has other users
+ * @rgd: The resource group
+ * @locked: true if we already hold the glock
+ *
+ * We're trying to figure out if the given rgrp has anybody competing for
+ * its free space. If other processes have enqueued its glock, there's a
+ * good chance of competition.
+ *
+ * If there are multi-block reservations for this rgrp, there's a good
+ * chance another process will lock the rgrp for block allocations soon.
+ *
+ * If we already hold the glock, we no longer care whether there are holders
+ * because that's now a given (rgrp glocks are never shared).
+ */
+static inline bool other_rgrp_users(const struct gfs2_rgrpd *rgd, bool locked)
+{
+ struct gfs2_glock *gl = rgd->rd_gl;
+
+ if (!locked && !list_empty(&gl->gl_holders))
+ return true;
+ if (!RB_EMPTY_ROOT(&rgd->rd_rstree))
+ return true;
+ return false;
+}
+
/**
* gfs2_rgrp_congested - decide whether a rgrp glock is congested
* @rs: The reservation in question
* @loops: An indication of how picky we can be (0=very, 1=less so)
+ * @locked: Indicates whether checks are done before or after we've enqueued the glock.
*
* There are two kinds of congestion: inter-node and intra-node.
*
@@ -1960,6 +1987,11 @@ static inline bool fast_to_acquire(const struct gfs2_rgrpd *rgd)
* with doing so, but each process needs to wait for the other to release the
* rgrp glock before it may proceed.
*
+ * Both kinds of congestion can hurt performance, but it's faster to check
+ * intra-node, so we do that first. After all, why bother to check whether
+ * we can get the glock quickly from DLM if other processes have also used
+ * that same reasoning?
+ *
* If we're not using inter-node locking (dlm) it doesn't make sense to check
* the glock statistics. Instead, we do some simple checks based on how
* desperate we are to get blocks (the number of loops).
@@ -1969,7 +2001,8 @@ static inline bool fast_to_acquire(const struct gfs2_rgrpd *rgd)
* a block reservation. On second loop, call it congested if it's not fast to
* acquire.
*/
-static bool gfs2_rgrp_congested(const struct gfs2_blkreserv *rs, int loops)
+static bool gfs2_rgrp_congested(const struct gfs2_blkreserv *rs, int loops,
+ bool locked)
{
const struct gfs2_rgrpd *rgd = rs->rs_rbm.rgd;
@@ -1982,6 +2015,10 @@ static bool gfs2_rgrp_congested(const struct gfs2_blkreserv *rs, int loops)
if (loops >= 2)
return false;
+ /* Check for intra-node congestion */
+ if (loops == 0 && other_rgrp_users(rgd, locked))
+ return true;
+
if (loops == 0 && !fast_to_acquire(rgd))
return true;
@@ -2065,14 +2102,14 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, struct gfs2_alloc_parms *ap)
rg_locked = 0;
if (skip && skip--)
goto next_rgrp;
- if (gfs2_rgrp_congested(rs, loops))
+ if (gfs2_rgrp_congested(rs, loops, false))
goto next_rgrp;
error = gfs2_glock_nq_init(rs->rs_rbm.rgd->rd_gl,
LM_ST_EXCLUSIVE, flags,
&rs->rs_rgd_gh);
if (unlikely(error))
return error;
- if (gfs2_rgrp_congested(rs, loops))
+ if (gfs2_rgrp_congested(rs, loops, true))
goto skip_rgrp;
if (sdp->sd_args.ar_rgrplvb) {
error = update_rgrp_lvb(rs->rs_rbm.rgd);
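
For reference, the heuristic visible in the quoted hunks can be modelled in user space as below. The struct fields are stand-ins for `!list_empty(&gl->gl_holders)`, `!RB_EMPTY_ROOT(&rgd->rd_rstree)` and `fast_to_acquire()`; the rest of gfs2_rgrp_congested (the DLM statistics checks) is not modelled, so this is a sketch of the quoted logic only, not gfs2 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the rgrp state the patch inspects. */
struct rgrp_model {
	bool has_other_holders;	/* other processes enqueued the glock */
	bool has_reservations;	/* multi-block reservations exist */
	bool fast;		/* stand-in for fast_to_acquire() */
};

/* Mirrors other_rgrp_users(): holders only matter before we lock. */
static bool other_rgrp_users_model(const struct rgrp_model *r, bool locked)
{
	if (!locked && r->has_other_holders)
		return true;
	return r->has_reservations;
}

/* Mirrors only the checks shown in the quoted diff hunks. */
static bool rgrp_congested_model(const struct rgrp_model *r, int loops,
				 bool locked)
{
	if (loops >= 2)
		return false;
	/* Intra-node check first: cheaper than the DLM statistics */
	if (loops == 0 && other_rgrp_users_model(r, locked))
		return true;
	if (loops == 0 && !r->fast)
		return true;
	return false;
}
```

This makes the two call sites in gfs2_inplace_reserve visible: before enqueueing (`locked == false`) other holders count as congestion; after enqueueing (`locked == true`) only the reservation tree does.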