On 11/29/13, 2:31 PM, Alexander Motin wrote:
> On 29.11.2013 16:16, Saso Kiselkov wrote:
>> On 11/29/13, 2:00 PM, Alexander Motin wrote:
>>> Hi.
>>>
>>> Running some benchmarks and profiles for backing block storage (iSCSI,
>>> etc) with ZFS files with recordsize=4K on 24-core FreeBSD machine I've
>>> noticed significant lock congestion on ZFS SPA config locks inside
>>> spa_config_enter() and spa_config_exit(). The source of it is numerous
>>> bp_get_dsize() calls, which acquire and drop SCL_VDEV reader lock, while
>>> called for every written block. Even if the lock scope is very small, so
>>> many acquisitions predictably cause congestion. The more CPUs system has
>>> the heavier congestion is. And since these locks are adaptive on FreeBSD
>>> -- I am getting heavy lock spinning, burning up to half of the CPU time.
>>>
>>> Is this problem known on other platforms?
>>>
>>> I've made a patch to replace mutex locks there with rwlocks, acquiring
>>> them for read in case SPA config lock is requested for read. On my tests
>>> this change doubles benchmark results and completely removes lock
>>> congestion from SPA config locks since write acquisitions there don't
>>> happen so often.
>>>
>>> On FreeBSD, due to difference in memory allocation semantics, both
>>> Solaris mutex and rwlock primitives are in fact emulated with the same
>>> sxlock primitive now, so my patch just uses functionality that is there
>>> any way. On illumos I suppose there is some difference, but I guess this
>>> patch still should give benefits since these accesses indeed can be
>>> shared and that is what shared locks are for.
>>>
>>> Any comments about: http://people.freebsd.org/~mav/spa_shared.patch ?
>>
>> How do you handle the cv_wait usage of scl_lock in spa_misc.c? Is
>> FreeBSD agnostic to using a rwlock instead of a mutex in condition
>> variable operations?
>
> Yes, cv_wait() is lock type agnostic on FreeBSD.

Hm, taking a step back, I'm wondering if it wouldn't make sense to
rewrite spa_config_enter/spa_config_exit completely in terms of
rwlock_t's instead of doing our own "poor man's" implementation of them.
As far as I can discern it, the semantics appear to be identical.

Cheers,
-- 
Saso
