On Fri, 11 Apr 2008, David Malone wrote:
On Fri, Apr 11, 2008 at 09:48:12AM +0000, Jeff Roberson wrote:
- Use a lockmgr lock rather than a mtx to protect dirhash. This lock
may be held for the duration of the various dirhash operations which
avoids many complex unlock/lock/revalidate sequences.
- Permit shared locks on lookup. To protect the ip->i_dirhash pointer we
use the vnode interlock in the shared case. Callers holding the
exclusive vnode lock can run without fear of concurrent modification to
i_dirhash.
- Hold an exclusive dirhash lock when creating the dirhash structure for
the first time or when re-creating a dirhash structure which has been
recycled.
Hi Jeff,
I've been reading through this patch to understand it. I've a few
questions:
1) You initialise a chunk of the dir hash struct earlier
now, though it was previously left until we knew memory
allocation was successful. Is there a reason for that?
Now instead of freeing a recycled dirhash pointer in ip->i_dirhash I lock
it and re-create it. This makes various races simpler. The dirhash has
to be minimally constructed for dirhash_free_locked to function correctly
if we bail out early.
2) Is ufsdirhash_create() trying harder than before to
allocate a dirhash? It looks like it might be, but maybe
that's just a feature of the new locking.
The conditions in ufsdirhash_build() which result in creating a new
directory hash remain the same, except that we are more aggressive about
reclaiming when the sysctl is lowered, to ease testing.
ufsdirhash_create() is a new function which tries to lock or allocate a
dirhash. This is complex because i_dirhash must be protected with the
vnode interlock if the directory lock is held shared. This code has to
deal with concurrent creation as well as concurrent destruction by
ufsdirhash_recycle().
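
The "lock or allocate" step can be sketched in userland roughly as follows.
This is only an illustration of handling concurrent creation, not the code
from the patch: a pthread mutex stands in for the vnode interlock,
dirhash_get_or_create() and the struct layouts are hypothetical, and the
concurrent destruction by ufsdirhash_recycle() is not modelled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct dirhash {
	int dh_placeholder;		/* real fields omitted */
};

struct inode {
	pthread_mutex_t i_interlock;	/* stands in for the vnode interlock */
	struct dirhash *i_dirhash;	/* protected by i_interlock */
};

/*
 * Return the inode's dirhash, creating one if none exists.  The
 * allocation happens with the interlock dropped, so the pointer has to
 * be re-checked afterwards in case another thread installed one first.
 */
static struct dirhash *
dirhash_get_or_create(struct inode *ip)
{
	struct dirhash *dh, *ndh = NULL;

	assert(ip != NULL);
	for (;;) {
		pthread_mutex_lock(&ip->i_interlock);
		dh = ip->i_dirhash;
		if (dh == NULL && ndh != NULL) {
			/* Nobody beat us to it: publish our allocation. */
			ip->i_dirhash = dh = ndh;
			ndh = NULL;
		}
		pthread_mutex_unlock(&ip->i_interlock);
		if (dh != NULL) {
			/* Lost the race (or nothing spare): free(NULL) is a no-op. */
			free(ndh);
			return (dh);
		}
		/* Allocate without holding the interlock, then retry. */
		ndh = calloc(1, sizeof(*ndh));
		if (ndh == NULL)
			return (NULL);
	}
}
```

A second call on the same inode returns the pointer installed by the first
rather than allocating again.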
3) You've added a dh_memreq member to the structure to
remember a value that's cheap to calculate from other
structure members and only required at allocation and free
time. I can understand why you'd want to avoid repeating
the code, but it would seem better to make it a macro rather
than storing it for the lifetime of the object?
Earlier versions of this patch suffered various accounting errors with the
space used by the i_dirhash structure. Since I may call
ufsdirhash_free_locked() before the space is accounted for, keeping the
integer was much simpler.
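
A minimal userland sketch of why the stored value helps, assuming a global
byte counter and a much simplified struct (dirhash_mem,
dirhash_account_alloc() and dirhash_free_accounted() are hypothetical names,
not the patch's): the free path subtracts whatever was actually charged,
which is zero if we bailed out before accounting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the global dirhash memory counter. */
static size_t dirhash_mem;

struct dirhash {
	size_t dh_memreq;	/* bytes charged to dirhash_mem; 0 until accounted */
	int *dh_hash;		/* simplified hash array */
};

/* Allocate the hash array and charge its size.  Returns 0 on success. */
static int
dirhash_account_alloc(struct dirhash *dh, size_t nslots)
{
	size_t memreq = sizeof(*dh) + nslots * sizeof(*dh->dh_hash);

	dh->dh_hash = calloc(nslots, sizeof(*dh->dh_hash));
	if (dh->dh_hash == NULL)
		return (-1);	/* nothing charged, dh_memreq stays 0 */
	dh->dh_memreq = memreq;
	dirhash_mem += memreq;
	return (0);
}

/* Safe to call whether or not the space was ever accounted for. */
static void
dirhash_free_accounted(struct dirhash *dh)
{
	free(dh->dh_hash);	/* free(NULL) is a no-op */
	dh->dh_hash = NULL;
	assert(dirhash_mem >= dh->dh_memreq);
	dirhash_mem -= dh->dh_memreq;
	dh->dh_memreq = 0;
}
```

Recomputing the size in a macro would work only if the members it depends on
were already valid on every early-exit path, which is the case this avoids.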
4) You replaced an unlocked read with a locked read in
ufsdirhash_lookup. I think this is because you are now
locking the dh_score with the list mutex and are taking
advantage of the fact that the score is changed just below?
Won't this mean that for a sequence of operations on one
directory, we'll now need to lock the list mutex for each
lookup and get the lockmanager lock on the dirhash? At face
value, this would seem to be worse than the old situation
of only getting the dh mutex for each operation?
In the normal ufs_lookup() path we enter 3 dirhash functions which can now
assume the lock is held rather than locking/unlocking. And in lookup in
particular we are no longer required to drop and reacquire the lock,
potentially multiple times. I'm sure it's a net reduction in lock
operations. That really wasn't the primary motivation however. I mostly
wanted to enable shared locking of directories. I doubt the number of mtx
operations is dominating the performance of directory operations in any
event.
In the case of multiple directories, I guess we'll have two
mutex locks replaced by one mutex and one lockmanager
(shared) lock? This is probably the usual case, which isn't
too bad.
I'm not sure I understand this.
5) It looks like the recycle policy (and locking thereof)
is now slightly different and I think it might do something
funny in some cases now. If we get the list lock in
ufsdirhash_recycle, but someone holds an exclusive lock on
some of the dirhashes, then we will walk over the locked
ones until we find an unlocked one and free it. We then
jump back to the start of the list and have to walk over
the locked ones again. I wonder if in a low memory situation
the locked nodes could be actually waiting for the list
lock in ufsdirhash_list_locked() and there could be some
sort of cascade where you end up freeing dirhashes that
should actually be kept. Actually, this probably isn't
possible because the list lock is dropped while we're doing
the free?
Because we drop the list lock we have to restart the processing from the
beginning. I guess it is a risk that we could have many locked dirhashes
that we'll have to skip causing the recycle loop to potentially take a
long time. However, the old code was strange here as well. It'd only
examine the head of the list, and would only recycle if the score reached 0.
Really we could just make this a simple LRU and be done with it. The
score seems redundant.
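
The restart behaviour can be sketched in a single-threaded userland model
like this. It is an illustration under assumptions, not the patch: a
dh_locked flag stands in for a failed attempt to lock a dirhash, the list is
a hand-rolled singly linked list, and dh_insert() and dh_recycle() are
hypothetical names. The point where the kernel code would drop and retake
the list lock is marked in a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct dirhash {
	struct dirhash *dh_next;
	bool dh_locked;		/* stands in for "try-lock would fail" */
	size_t dh_memreq;	/* bytes charged for this dirhash */
};

static struct dirhash *dh_list;	/* head of the recycle list */
static size_t dirhash_mem;	/* total bytes charged */

/* Push a new entry on the head of the list and charge its size. */
static struct dirhash *
dh_insert(bool locked, size_t memreq)
{
	struct dirhash *dh = malloc(sizeof(*dh));

	if (dh == NULL)
		return (NULL);
	dh->dh_locked = locked;
	dh->dh_memreq = memreq;
	dh->dh_next = dh_list;
	dh_list = dh;
	dirhash_mem += memreq;
	return (dh);
}

/*
 * Free unlocked entries until at least `wanted` bytes are reclaimed or
 * no unlocked entry remains.  Every pass starts again from the head,
 * so locked entries near the front are walked over repeatedly.
 */
static size_t
dh_recycle(size_t wanted)
{
	struct dirhash **prevp, *dh;
	size_t freed = 0;

	while (freed < wanted) {
		for (prevp = &dh_list; (dh = *prevp) != NULL;
		    prevp = &dh->dh_next)
			if (!dh->dh_locked)
				break;
		if (dh == NULL)
			break;		/* everything left is locked */
		*prevp = dh->dh_next;
		/* The list lock would be dropped here for the free... */
		assert(dirhash_mem >= dh->dh_memreq);
		dirhash_mem -= dh->dh_memreq;
		freed += dh->dh_memreq;
		free(dh);
		/* ...and retaken here, which is why the scan restarts. */
	}
	return (freed);
}
```

With k locked entries ahead of each unlocked one, every free costs an extra
k steps of walking, which is the long-running loop described above.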
Jeff
David.