Re: [OpenZFS Developer] ZFS SPA config locks congestion

George Wilson Wed, 04 Dec 2013 08:48:25 -0800

On 12/4/13 11:30 AM, Alexander Motin wrote:

On 04.12.2013 17:48, George Wilson wrote:

On 12/3/13 9:49 PM, Saso Kiselkov wrote:

On 12/4/13, 2:19 AM, Matthew Ahrens wrote:

On Fri, Nov 29, 2013 at 1:36 PM, Saso Kiselkov <[email protected]
<mailto:[email protected]>> wrote:
The only "good" reason I can think of is read-write locks notbeing
     present in the kernel when this was written.
rwlocks definitely existed (in the solaris kernel) when ZFS waswritten.
  I'm not sure why we used a hand-rolled reader-writer lock rather
than a
krwlock_t.  Maybe George knows?

I suspect that the problem could be related with not wanting to acquire
and subsequently drop a rwlock in different threads?

     Other than that, the
     spa_config_lock semantics are exactly the same as in krwlock_t.

> Is there some kind of lock debugging facility in illumoskernel?


     You'll have to be a bit more specific. There's plenty of lock
analysis
     and debugging tools, but that's really not what we're after
here. We're
     after the semantics. I think I understand the krwlock_t
semantics from
     looking at the code (and they appear identical to FreeBSD's,
though I
     haven't looked at Linux), and as far as I can discern, the code
in that

webrev gives the exact same behavior as the legacyspa_config_locks

     implementation.


Using a rwlock would lose the debugability of the refcount_t scl_count
(tag matching, ::refcount).  That may be acceptable, though.

Well, according to Alex, it does improve performance (though we have to
see more details on that still) and it simplifies the code a lot.

Cheers,

Sorry I'm coming late to the party. The reason for the special
spa_config_lock semantics is to ensure that a lock can be grabbed by one
thread and released by another. Changing to a plain rwlock
implementation would be problematic based on that requirement. I believe
that this is only an issue when you grab the rwlock as writer since it
tracks the thread that obtained the write lock in the locking structure.
I don't recall if we still have instances where we grab the lock as
writer and then release it by a different thread so I think it makes
sense to investigate making the change to a plain rwlock implementation.
This would require an extensive amount of testing especially when
dealing with configuration changes and cases where we deal with I/O
errors (vdev_probe code path).

I think it can be safely assumed that writer is always dropped by thesame thread because it is already asserted in spa_config_exit():


        if (refcount_remove(&scl->scl_count, tag) == 0) {
            ASSERT(scl->scl_writer == NULL ||
                scl->scl_writer == curthread);
            scl->scl_writer = NULL;    /* OK in either case */

Good point. I can't see any other reason why we shouldn't try to moveforward with a rwlock implementation.


- George

_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Re: [OpenZFS Developer] ZFS SPA config locks congestion

Reply via email to