On 11/29/13, 2:31 PM, Alexander Motin wrote:
> On 29.11.2013 16:16, Saso Kiselkov wrote:
>> On 11/29/13, 2:00 PM, Alexander Motin wrote:
>>> Hi.
>>>
>>> While running benchmarks and profiles for backing block storage (iSCSI,
>>> etc.) with ZFS files with recordsize=4K on a 24-core FreeBSD machine, I
>>> noticed significant lock congestion on the ZFS SPA config locks inside
>>> spa_config_enter() and spa_config_exit().  The source is the numerous
>>> bp_get_dsize() calls, which acquire and drop the SCL_VDEV reader lock and
>>> are made for every written block. Even though the lock scope is very
>>> small, so many acquisitions predictably cause congestion. The more CPUs
>>> the system has, the heavier the congestion. And since these locks are
>>> adaptive on FreeBSD, I am getting heavy lock spinning, burning up to half
>>> of the CPU time.
>>>
>>> Is this problem known on other platforms?
>>>
>>> I've made a patch that replaces the mutex locks there with rwlocks,
>>> acquiring them for read when the SPA config lock is requested for read.
>>> In my tests this change doubles benchmark results and completely removes
>>> lock congestion on the SPA config locks, since write acquisitions there
>>> don't happen very often.
>>>
>>> On FreeBSD, due to differences in memory allocation semantics, both the
>>> Solaris mutex and rwlock primitives are in fact emulated with the same
>>> sxlock primitive now, so my patch just uses functionality that is
>>> already there. On illumos I suppose there is some difference, but I guess
>>> this patch should still give benefits, since these accesses can indeed be
>>> shared and that is what shared locks are for.
>>>
>>> Any comments about: http://people.freebsd.org/~mav/spa_shared.patch ?
>>
>> How do you handle the cv_wait usage of scl_lock in spa_misc.c? Is
>> FreeBSD agnostic to using a rwlock instead of a mutex in condition
>> variable operations?
> 
> Yes, cv_wait() is lock type agnostic on FreeBSD.

Hm, taking a step back, I'm wondering whether it would make sense to
rewrite spa_config_enter/spa_config_exit completely in terms of
rwlock_t's instead of keeping our own "poor man's" implementation of them.
As far as I can tell, the semantics appear to be identical.
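
To make the comparison concrete, here is a rough sketch of the kind of
mutex-plus-condvar reader/writer scheme being discussed. It is a simplified
model written against POSIX threads, not the actual spa_misc.c code: the
cfg_lock_t type and the cfg_init/cfg_enter/cfg_exit names are made up for
illustration, and details such as per-hold tags and refcount tracking are
left out. The point it shows is that even a read acquisition has to take
the inner mutex twice (once on enter, once on exit), which is where the
congestion described above would come from.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified model of one config lock (not spa_config_lock_t). */
typedef struct cfg_lock {
    pthread_mutex_t mtx;          /* protects all fields below */
    pthread_cond_t  cv;           /* blocked readers and writers sleep here */
    int             readers;      /* number of active read holds */
    int             write_wanted; /* number of writers currently waiting */
    bool            writer;       /* true while a writer holds the lock */
} cfg_lock_t;

void cfg_init(cfg_lock_t *l)
{
    pthread_mutex_init(&l->mtx, NULL);
    pthread_cond_init(&l->cv, NULL);
    l->readers = 0;
    l->write_wanted = 0;
    l->writer = false;
}

void cfg_enter(cfg_lock_t *l, bool as_writer)
{
    pthread_mutex_lock(&l->mtx);  /* taken even for read acquisitions */
    if (as_writer) {
        /* Announce intent so new readers hold off, then wait for exclusivity. */
        l->write_wanted++;
        while (l->writer || l->readers > 0)
            pthread_cond_wait(&l->cv, &l->mtx);
        l->write_wanted--;
        l->writer = true;
    } else {
        /* Readers wait while a writer is active or waiting. */
        while (l->writer || l->write_wanted > 0)
            pthread_cond_wait(&l->cv, &l->mtx);
        l->readers++;
    }
    pthread_mutex_unlock(&l->mtx);
}

void cfg_exit(cfg_lock_t *l)
{
    pthread_mutex_lock(&l->mtx);
    assert(l->writer || l->readers > 0);  /* the lock must be held */
    if (l->writer)
        l->writer = false;        /* a writer never coexists with readers */
    else
        l->readers--;
    if (l->readers == 0)
        pthread_cond_broadcast(&l->cv);   /* wake anyone who can now proceed */
    pthread_mutex_unlock(&l->mtx);
}
```

In this model, cfg_enter(&l, false) from many CPUs serializes on l->mtx even
though the read holds themselves are compatible. A native rwlock taken in
read mode would avoid that inner mutex on the read path, assuming the
platform's rwlock implements shared acquisition efficiently.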

Cheers,
-- 
Saso

_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
