On 12.11.2013 18:22, Matthew Ahrens wrote:
> On Tue, Nov 12, 2013 at 7:51 AM, Saso Kiselkov <[email protected]> wrote:
>> On 11/12/13, 3:38 PM, Alexander Motin wrote:
>>> Hi.
>>>
>>> While doing some performance tests I've found that LZ4 compression in
>>> ZFS on FreeBSD allocates hash memory directly from VM each time, which
>>> on a multi-core system under significant load may consume more CPU
>>> time than the compression itself. On 64-bit illumos that memory is
>>> allocated on the stack, but FreeBSD's kernel stack is smaller and does
>>> not have enough space (16K). I've made a quite simple patch to reduce
>>> the allocation overhead by creating an allocation cache, the same as is
>>> done for ZIO. While for 64-bit illumos this patch is a no-op, smaller
>>> architectures may still benefit from it, the same as FreeBSD does.
>>>
>>> Any comments about it: http://people.freebsd.org/~mav/lz4_alloc.patch ?
>>
>> After a bit of benchmarking Illumos switched to using kmem_alloc for LZ4
>> compression as well (discarding the stack allocations, because they were
>> fragile and didn't do much for performance). It'd be interesting to see
>> why kmem operations on FreeBSD are so inefficient under load - perhaps
>> some worthwhile refactoring work there? Or can you please post more
>> details of your testing setup?
>
> My understanding is that on FreeBSD, kmem_cache_alloc() uses
> uma_zalloc_arg(), which has fast, per-CPU caches of free buffers (like
> illumos). But on FreeBSD, kmem_alloc() uses malloc(), which is slower
> (whereas on illumos, kmem_alloc() just calls kmem_cache_alloc() from an
> appropriately-sized cache). See
> sys/cddl/compat/opensolaris/kern/opensolaris_kmem.c for details.
>
> Does anyone know the reasoning behind this? I.e. why kmem_alloc() does
> not have similar performance characteristics on FreeBSD as on illumos?

FreeBSD malloc() does use uma_zalloc_arg() caches for small allocations. For big allocations that is less practical, because large per-CPU caches tend to eat too much extra memory, and it is quite hard to purge those per-CPU caches in low-memory conditions. But given that illumos has a separate kmem_cache_alloc() KPI at all, there should probably be some difference from plain kmem_alloc() there too.
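
To make the idea of an allocation cache concrete, here is a minimal user-space sketch in C. It is not the patch and not the kernel API: the names (buf_cache_*), the 8-buffer limit and the use of plain malloc()/free() as the slow path are all assumptions made for illustration, and the 16K buffer size is simply the figure mentioned in this thread. It is also not thread-safe; the kernel allocators discussed here avoid locking by keeping such caches per CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed buffer size, taken from the 16K figure in the thread. */
#define HASH_BUF_SIZE   (16 * 1024)
/* Hypothetical cap on how many free buffers the cache may hold. */
#define CACHE_MAX_FREE  8

struct buf_cache {
    void   *free_bufs[CACHE_MAX_FREE]; /* stack of cached free buffers */
    size_t  nfree;                     /* number of cached buffers */
    size_t  hits;                      /* allocations served from the cache */
    size_t  misses;                    /* allocations that fell back to malloc() */
};

static void
buf_cache_init(struct buf_cache *c)
{
    c->nfree = 0;
    c->hits = 0;
    c->misses = 0;
}

/* Fast path: reuse a cached buffer. Slow path: ask the general allocator. */
static void *
buf_cache_alloc(struct buf_cache *c)
{
    if (c->nfree > 0) {
        c->hits++;
        return (c->free_bufs[--c->nfree]);
    }
    c->misses++;
    return (malloc(HASH_BUF_SIZE));
}

/* Keep the buffer for reuse if there is room, otherwise release it. */
static void
buf_cache_free(struct buf_cache *c, void *buf)
{
    assert(buf != NULL);
    if (c->nfree < CACHE_MAX_FREE)
        c->free_bufs[c->nfree++] = buf;
    else
        free(buf);
}

/* Release every cached buffer, e.g. at teardown or under memory pressure. */
static void
buf_cache_fini(struct buf_cache *c)
{
    while (c->nfree > 0)
        free(c->free_bufs[--c->nfree]);
}
```

The cap on cached buffers is where the memory trade-off above shows up: every cached 16K buffer is memory that stays held until something like buf_cache_fini() purges it, and with one cache per CPU that cost is multiplied by the CPU count.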

-- 
Alexander Motin
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
