On Fri, 11 Apr 2008, David Malone wrote:

On Fri, Apr 11, 2008 at 09:48:12AM +0000, Jeff Roberson wrote:
   - Use a lockmgr lock rather than a mtx to protect dirhash.  This lock
     may be held for the duration of the various dirhash operations which
     avoids many complex unlock/lock/revalidate sequences.
   - Permit shared locks on lookup.  To protect the ip->i_dirhash pointer we
     use the vnode interlock in the shared case.  Callers holding the
     exclusive vnode lock can run without fear of concurrent modification to
     i_dirhash.
   - Hold an exclusive dirhash lock when creating the dirhash structure for
     the first time or when re-creating a dirhash structure which has been
     recycled.

Hi Jeff,

I've been reading through this patch to understand it. I've a few
questions:

        1) You initialise a chunk of the dir hash struct earlier
        now, though it was previously left until we knew memory
        allocation was successful. Is there a reason for that?

Now instead of freeing a recycled dirhash pointer in ip->i_dirhash I lock it and re-create it. This makes various races simpler. The dirhash has to be minimally constructed for dirhash_free_locked to function correctly if we bail out early.


        2) Is ufsdirhash_create() trying harder than before to
        allocate a dirhash? It looks like it might be, but maybe
        that's just a feature of the new locking.

The conditions in ufsdirhash_build() which result in creating a new directory hash remain the same except we are more agressive about reclaiming when the sysctl is lowered to ease testing.

ufsdirhash_create() is a new function which tries to lock or allocate a dirhash. This is complex because i_dirhash must be protected with the vnode interlock if the directory lock is held shared. This code has to deal with concurrent creation as well as concurrent destruction by ufsdirhash_recycle().


        3) You've added a dh_memreq member to the structure to
        remember a value that's cheap to calculate from other
        structure members and only required at allocation and free
        time. I can understand why you'd want to avoid repeating
        the code, but it would seem better to make it a macro rather
        than storing it for the lifetime of the object?

Earlier versions of this patch suffered various accounting errors with the space used by the i_dirhash structure. Since I may call ufsdirhash_free_locked() before space is accounted for keeping the integer was much simpler.


        4) You replaced an unlocked read with a locked read in
        ufsdirhash_lookup. I think this is because you are now
        locking the dh_score with the list mutex and are taking
        advantage of the fact that the score is changed just below?
        Won't this mean that for a sequence of operations on one
        directory, we'll now need to lock the list mutex for each
        lookup and get the lockmanager lock on the dirhash. At face
        value, this would seems to be worse than the old situation
        of only getting the dh mutex for each opteation?

In the normal ufs_lookup() path we enter 3 dirhash functions which can now assume the lock is held rather than locking/unlocking. And in lookup in particular we are no longer required to drop and reacquire the lock, potentially multiple times. I'm sure it's a net reduction in lock operations. That really wasn't the primary motivation however. I mostly wanted to enable shared locking of directories. I doubt the number of mtx operations is dominating the performance of directory operations in any event.


       In the case of multiple directories, I guess we'll have two
       mutex locks replaced by one mutex and one lockmanager
       (shared) lock? This is probably the usual case, which isn't
        too bad.

I'm not sure I understand this.


        5) It looks like the recycle policy (and locking thereof)
        is now slightly different and I think it might do something
        funny in some cases now.  If we get the list lock in
        ufsdirhash_recycle, but someone holds an exclusive lock on
        some of the dirhashes, then we will walk over the locked
        ones until we find an unlocked one and free it. We then
        jump back to the start of the list and have to walk over
        the locked ones again. I wonder if in a low memory situation
        the locked nodes could be actually waiting for the list
        lock in ufsdirhash_list_locked() and there could be some
        sort of cascade where you end up freeing dirhashes that
        should actually be kept. Actually, this probably isn't
        possible because the list lock is dropped while we're doing
        the free?

Because we drop the list lock we have to restart the processing from the beginning. I guess it is a risk that we could have many locked dirhashes that we'll have to skip causing the recycle loop to potentially take a long time. However, the old code was strange here as well. It'd only examine the head of the list and only recycled if the score reached 0.

Really we could just make this a simple LRU and be done with it. The score seems redundant.

Jeff


David.

_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Reply via email to