Hi,

On 02/11/17 17:06, Bob Peterson wrote:
Addressing Steve's other questions:

----- Original Message -----
| > We can predict whether a rgrp glock is the victim of intranode
| > congestion based on set of rules. Rules Enforced by this patch:
| >
| > 1. If the current process has a multi-block reservation in the
| >     rgrp, it needs to use it regardless of the congestion. The
| >     congestion algorithm should have prevented the reservation
| >     in the first place.
| > 2. If some process has the rgrp gl_lockref spin_lock locked,
| >     they are preparing to use it for a reservation. So we take
| >     this as a clear sign of impending contention.
| There can be all kinds of reasons why this lock has been taken. It
| should have very low contention on it, so I don't think this is
| useful overall.

I'll try some experiments without this check.
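
For reference, the sort of test rules 1 and 2 describe looks roughly like
this (a sketch only, not the code in the patch; rgrp_looks_contended() is a
made-up name, and the fields are the ones from fs/gfs2/incore.h):

static bool rgrp_looks_contended(struct gfs2_inode *ip, struct gfs2_rgrpd *rgd)
{
        struct gfs2_glock *gl = rgd->rd_gl;

        /* Rule 1: if we already have a reservation in this rgrp we must
           use it, so congestion does not matter here. */
        if (gfs2_rs_active(&ip->i_res) && ip->i_res.rs_rbm.rgd == rgd)
                return false;

        /* Rule 2: someone on this node holds the glock's lockref spin lock,
           which the patch takes as a sign they are about to reserve from
           this rgrp. */
        if (spin_is_locked(&gl->gl_lockref.lock))
                return true;

        return false;
}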

| > 3. If the rgrp currently has a glock holder, we know we need to
| >     wait before we can lock it, regardless of whether the holder
| >     represents an actual holder or a waiter.
| That is probably wrong. We ought to be able to do multiple allocations
| in an rgrp if the node is holding the rgrp lock, without having to
| constantly queue and requeue locks. In other words, if we hold the rgrp
| in excl mode, then a useful extension would be to allow multiple users
| to perform allocations in parallel. So the exclusive rgrp glock becomes
| locally sharable - that is the real issue here I think.

That sounds like a really good enhancement, but out of the scope of
this patch.
Yes, that's true. However, the patch as it stands does a lot of different things, and those things need to be separated out and looked at individually. Which ones really make the performance difference in this case?

Doing so many things in a single patch also makes it very tricky to deal with any regression noticed later, since rolling back is all or nothing. These things definitely need splitting up into smaller changes.

| > 4. If the rgrp currently has a multi-block reservation, and we
| >     already know it's not ours, then intranode contention is
| >     likely.
| What do you mean by "not ours"? In general each node only sees the
| reservations which are taken out on the node itself.

What I mean is: is the reservation associated with the current inode?
If the current inode for which we're allocating blocks has a current
reservation associated with it (gfs2_rs_active(rs) where rs is ip->i_res)
then we're obliged to use it. If not, we go out searching for a new
reservation in the rgrp. If the rgrp has a reservation queued from
a different inode (so its rgrp->rd_rstree is not empty) then another
process on this node has queued a reservation to this rgrp for the
purposes of a different file.
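
In code terms, the "ours" distinction is roughly the following (again just
a sketch; rgrp_has_foreign_rs() is a made-up name):

static bool rgrp_has_foreign_rs(struct gfs2_inode *ip, struct gfs2_rgrpd *rgd)
{
        /* "Ours": the reservation hanging off the inode we are allocating
           for; if it is active we are obliged to use it. */
        if (gfs2_rs_active(&ip->i_res))
                return false;

        /* Otherwise, a non-empty rd_rstree means another process on this
           node has queued a reservation in this rgrp for a different file. */
        return !RB_EMPTY_ROOT(&rgd->rd_rstree);
}
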
The current allocation algorithm tries to put related files (i.e. those with the same parent directory) near each other. That is not an unreasonable plan, and if there are multiple inodes with the same parent directory being written to at the same time, it is not a good plan to separate them too far. This is really just a workaround for the bigger issue of not being able to have multiple local processes using the same rgrp at the same time. I think that needs looking at again.
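
Roughly the shape I have in mind is something like this (pseudo-code only;
rd_alloc_sem and gfs2_claim_extent() do not exist, they just stand in for
whatever local locking and bitmap search this would actually need):

static int gfs2_alloc_shared(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip,
                             unsigned int requested)
{
        int error;

        /* rd_gl is already held in EX by this node, so there is no need to
           queue and requeue glock holders for every allocation.  Allocators
           take a per-rgrp lock shared (rd_alloc_sem is hypothetical), so
           several local processes can allocate in parallel; anything that
           needs the rgrp to itself locally would take it for write. */
        down_read(&rgd->rd_alloc_sem);
        error = gfs2_claim_extent(rgd, ip, requested); /* hypothetical */
        up_read(&rgd->rd_alloc_sem);

        return error;
}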

Steve.
