On 11/29/13, 3:00 PM, Alexander Motin wrote:
> On 29.11.2013 16:49, Saso Kiselkov wrote:
>> On 11/29/13, 2:31 PM, Alexander Motin wrote:
>>> On 29.11.2013 16:16, Saso Kiselkov wrote:
>>>> On 11/29/13, 2:00 PM, Alexander Motin wrote:
>>>>> Hi.
>>>>>
>>>>> Running some benchmarks and profiles for backing block storage (iSCSI,
>>>>> etc) with ZFS files with recordsize=4K on 24-core FreeBSD machine I've
>>>>> noticed significant lock congestion on ZFS SPA config locks inside
>>>>> spa_config_enter() and spa_config_exit(). The source of it is numerous
>>>>> bp_get_dsize() calls, which acquire and drop SCL_VDEV reader lock, while
>>>>> called for every written block. Even if the lock scope is very small, so
>>>>> many acquisitions predictably cause congestion. The more CPUs system has
>>>>> the heavier congestion is. And since these locks are adaptive on FreeBSD
>>>>> -- I am getting heavy lock spinning, burning up to half of the CPU time.
>>>>>
>>>>> Is this problem known on other platforms?
>>>>>
>>>>> I've made a patch to replace mutex locks there with rwlocks, acquiring
>>>>> them for read in case SPA config lock is requested for read. On my tests
>>>>> this change doubles benchmark results and completely removes lock
>>>>> congestion from SPA config locks since write acquisitions there don't
>>>>> happen so often.
>>>>>
>>>>> On FreeBSD, due to difference in memory allocation semantics, both
>>>>> Solaris mutex and rwlock primitives are in fact emulated with the same
>>>>> sxlock primitive now, so my patch just uses functionality that is there
>>>>> any way. On illumos I suppose there is some difference, but I guess this
>>>>> patch still should give benefits since these accesses indeed can be
>>>>> shared and that is what shared locks are for.
>>>>>
>>>>> Any comments about: http://people.freebsd.org/~mav/spa_shared.patch ?
>>>>
>>>> How do you handle the cv_wait usage of scl_lock in spa_misc.c? Is
>>>> FreeBSD agnostic to using a rwlock instead of a mutex in condition
>>>> variable operations?
>>>
>>> Yes, cv_wait() is lock type agnostic on FreeBSD.
>>
>> Hm, taking a step back, I'm wondering if it wouldn't make sense to
>> rewrite spa_config_enter/spa_config_exit completely in terms of
>> rwlock_t's instead of doing our own "poor man's" implementation of them.
>> As far as I can discern it, the semantics appear to be identical.
>
> AFAIK the main problem is that those SPA config locks may be acquired by
> one thread, while dropped by another. At least on FreeBSD system locking
> primitives don't allow that.
>

This appears to work: http://37.153.99.61/spa_config_rwlock/

The only kind-of special requirement for this code is that rw_write_held
test that the lock is held by not just any thread, but by curthread
(mirroring mutex_held() in a sense).

Cheers,
--
Saso
