On 11/29/13, 3:00 PM, Alexander Motin wrote:
> On 29.11.2013 16:49, Saso Kiselkov wrote:
>> On 11/29/13, 2:31 PM, Alexander Motin wrote:
>>> On 29.11.2013 16:16, Saso Kiselkov wrote:
>>>> On 11/29/13, 2:00 PM, Alexander Motin wrote:
>>>>> Hi.
>>>>>
>>>>> Running some benchmarks and profiles for backing block storage (iSCSI,
>>>>> etc) with ZFS files with recordsize=4K on 24-core FreeBSD machine I've
>>>>> noticed significant lock congestion on ZFS SPA config locks inside
>>>>> spa_config_enter() and spa_config_exit().  The source of it is numerous
>>>>> bp_get_dsize() calls, which acquire and drop SCL_VDEV reader lock, while
>>>>> called for every written block. Even if the lock scope is very small, so
>>>>> many acquisitions predictably cause congestion. The more CPUs system has
>>>>> the heavier congestion is. And since these locks are adaptive on FreeBSD
>>>>> -- I am getting heavy lock spinning, burning up to half of the CPU time.
>>>>>
>>>>> Is this problem known on other platforms?
>>>>>
>>>>> I've made a patch to replace mutex locks there with rwlocks, acquiring
>>>>> them for read in case SPA config lock is requested for read. On my tests
>>>>> this change doubles benchmark results and completely removes lock
>>>>> congestion from SPA config locks since write acquisitions there don't
>>>>> happen so often.
>>>>>
>>>>> On FreeBSD, due to difference in memory allocation semantics, both
>>>>> Solaris mutex and rwlock primitives are in fact emulated with the same
>>>>> sxlock primitive now, so my patch just uses functionality that is there
>>>>> anyway. On illumos I suppose there is some difference, but I guess this
>>>>> patch still should give benefits since these accesses indeed can be
>>>>> shared and that is what shared locks are for.
>>>>>
>>>>> Any comments about: http://people.freebsd.org/~mav/spa_shared.patch ?
>>>>
>>>> How do you handle the cv_wait usage of scl_lock in spa_misc.c? Is
>>>> FreeBSD agnostic to using a rwlock instead of a mutex in condition
>>>> variable operations?
>>>
>>> Yes, cv_wait() is lock type agnostic on FreeBSD.
>>
>> Hm, taking a step back, I'm wondering if it wouldn't make sense to
>> rewrite spa_config_enter/spa_config_exit completely in terms of
>> rwlock_t's instead of doing our own "poor man's" implementation of them.
>> As far as I can discern it, the semantics appear to be identical.
> 
> AFAIK the main problem is that those SPA config locks may be acquired by
> one thread, while dropped by another. At least on FreeBSD system locking
> primitives don't allow that.
> 

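To make the hand-off concern concrete, here is a rough userland sketch of a
counter-based reader/writer lock built from a mutex and a condition variable.
It is a simplified illustration with POSIX threads, not the actual spa_misc.c
code; count_lock_t and all cl_* names are made up for this example. Because
the hold is just a counter, nothing ties it to the acquiring thread, so any
thread may drop it. Note that every enter and exit, reader or writer, still
takes the inner mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical counter-based lock: holds are counts, not thread-owned. */
typedef struct count_lock {
	pthread_mutex_t	cl_mtx;		/* protects the fields below */
	pthread_cond_t	cl_cv;		/* signalled on every release */
	int		cl_readers;	/* number of active read holds */
	bool		cl_writer;	/* true while a write hold exists */
} count_lock_t;

void
cl_init(count_lock_t *cl)
{
	pthread_mutex_init(&cl->cl_mtx, NULL);
	pthread_cond_init(&cl->cl_cv, NULL);
	cl->cl_readers = 0;
	cl->cl_writer = false;
}

void
cl_enter(count_lock_t *cl, bool writer)
{
	pthread_mutex_lock(&cl->cl_mtx);
	if (writer) {
		/* A writer waits until there are no holds at all. */
		while (cl->cl_writer || cl->cl_readers > 0)
			pthread_cond_wait(&cl->cl_cv, &cl->cl_mtx);
		cl->cl_writer = true;
	} else {
		/* A reader only waits for an active writer. */
		while (cl->cl_writer)
			pthread_cond_wait(&cl->cl_cv, &cl->cl_mtx);
		cl->cl_readers++;
	}
	pthread_mutex_unlock(&cl->cl_mtx);
}

/* May be called by any thread, not only the one that entered. */
void
cl_exit(count_lock_t *cl, bool writer)
{
	pthread_mutex_lock(&cl->cl_mtx);
	if (writer)
		cl->cl_writer = false;
	else
		cl->cl_readers--;
	pthread_cond_broadcast(&cl->cl_cv);
	pthread_mutex_unlock(&cl->cl_mtx);
}

struct cl_exit_args {
	count_lock_t	*cl;
	bool		writer;
};

static void *
cl_exit_thread(void *arg)
{
	struct cl_exit_args *a = arg;

	cl_exit(a->cl, a->writer);
	return (NULL);
}

/* Demo helper: drop a hold from a freshly created thread. */
void
cl_exit_from_other_thread(count_lock_t *cl, bool writer)
{
	struct cl_exit_args a = { cl, writer };
	pthread_t t;

	pthread_create(&t, NULL, cl_exit_thread, &a);
	pthread_join(t, NULL);
}
```

If config locks really are handed between threads like this, a rewrite on
top of system rwlocks would have to keep that working.
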
A rewrite on top of rwlocks appears to work: http://37.153.99.61/spa_config_rwlock/
The only kind-of special requirement for this code is that rw_write_held()
tests that the lock is held not by just any thread, but by curthread
(mirroring mutex_held() in a sense).

Cheers,
-- 
Saso
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
