On 29.11.2013 19:04, Saso Kiselkov wrote:
On 11/29/13, 3:00 PM, Alexander Motin wrote:
On 29.11.2013 16:49, Saso Kiselkov wrote:
On 11/29/13, 2:31 PM, Alexander Motin wrote:
On 29.11.2013 16:16, Saso Kiselkov wrote:
On 11/29/13, 2:00 PM, Alexander Motin wrote:
Hi.

Running some benchmarks and profiles for backing block storage (iSCSI,
etc) with ZFS files with recordsize=4K on 24-core FreeBSD machine I've
noticed significant lock congestion on ZFS SPA config locks inside
spa_config_enter() and spa_config_exit().  The source of it is
numerous
bp_get_dsize() calls, which acquire and drop SCL_VDEV reader lock,
while
called for every written block. Even if the lock scope is very
small, so
many acquisitions predictably cause congestion. The more CPUs
system has
the heavier congestion is. And since these locks are adaptive on
FreeBSD
-- I am getting heavy lock spinning, burning up to half of the CPU
time.

Is this problem known on other platforms?

I've made a patch to replace mutex locks there with rwlocks, acquiring
them for read in case SPA config lock is requested for read. On my
tests
this change doubles benchmark results and completely removes lock
congestion from SPA config locks since write acquisitions there don't
happen so often.

On FreeBSD, due to difference in memory allocation semantics, both
Solaris mutex and rwlock primitives are in fact emulated with the same
sxlock primitive now, so my patch just uses functionality that is
there
any way. On illumos I suppose there is some difference, but I guess
this
patch still should give benefits since these accesses indeed can be
shared and that is what shared locks are for.

Any comments about: http://people.freebsd.org/~mav/spa_shared.patch ?

How do you handle the cv_wait usage of scl_lock in spa_misc.c? Is
FreeBSD agnostic to using a rwlock instead of a mutex in condition
variable operations?

Yes, cv_wait() is lock type agnostic on FreeBSD.

Hm, taking a step back, I'm wondering if it wouldn't make sense to
rewrite spa_config_enter/spa_config_exit completely in terms of
rwlock_t's instead of doing our own "poor man's" implementation of them.
As far as I can discern it, the semantics appear to be identical.

AFAIK the main problem is that those SPA config locks may be acquired by
one thread, while dropped by another. At least on FreeBSD system locking
primitives don't allow that.

This appears to work: http://37.153.99.61/spa_config_rwlock/
The only kind-of special requirement for this code is that rw_write_held
test that the lock is held by not just any thread, but by curthread
(mirroring mutex_held() in a sense).

Then probably we need somebody who knows history of this code in Solaris or detailed rwlocks semantics. From quick look on illumos code I haven't found good reason why reader unlock from different thread may not work (while for writer that is denied even in present code), but I suppose there should be some good reason to implement things in this custom way. Is there some kind of lock debugging facility in illumos kernel? Are they agree to live with this too?

--
Alexander Motin
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to