On 12.11.2013 18:12, Saso Kiselkov wrote:
On 11/12/13, 4:08 PM, Alexander Motin wrote:
On 12.11.2013 17:51, Saso Kiselkov wrote:
On 11/12/13, 3:38 PM, Alexander Motin wrote:
Hi.

While doing some performance tests I found that LZ4 compression in
ZFS on FreeBSD allocates its hash memory directly from the VM on every
call, which on a multi-core system under significant load may consume
more CPU time than the compression itself. On 64-bit illumos that memory
is allocated on the stack, but FreeBSD's kernel stack is smaller and does
not have enough space for it (16K). I've made a fairly simple patch that
reduces the allocation overhead by creating an allocation cache, the same
way it is done for ZIO. While for 64-bit illumos this patch is a no-op,
smaller architectures may still benefit from it, just as FreeBSD does.

Any comments on it? http://people.freebsd.org/~mav/lz4_alloc.patch
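
To illustrate the general idea of an allocation cache for fixed-size
buffers, here is a minimal userspace sketch in C. It is not the patch
linked above and it does not use any kernel allocator interface:
struct buf_cache and the buf_cache_*() functions are hypothetical names
made up for this example, and malloc()/free() merely stand in for the
expensive backend allocation. The sketch is single-threaded; a real
kernel cache would need locking or per-CPU caching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HASH_BUF_SIZE (16 * 1024)  /* 16K hash memory, as in the thread */

/* Hypothetical userspace stand-in for a fixed-size object cache. */
struct buf_cache {
    void  *free_list;       /* singly linked list of idle buffers */
    size_t buf_size;        /* size of every buffer in this cache */
    size_t backend_allocs;  /* times we fell through to malloc() */
};

static void
buf_cache_init(struct buf_cache *c, size_t buf_size)
{
    /* Each idle buffer stores the list link in its first bytes. */
    assert(buf_size >= sizeof(void *));
    c->free_list = NULL;
    c->buf_size = buf_size;
    c->backend_allocs = 0;
}

static void *
buf_cache_alloc(struct buf_cache *c)
{
    void *buf = c->free_list;

    if (buf != NULL) {
        /* Fast path: reuse an idle buffer, no backend call. */
        c->free_list = *(void **)buf;
        return buf;
    }
    /* Slow path: the cache is empty, ask the backend allocator. */
    c->backend_allocs++;
    return malloc(c->buf_size);
}

static void
buf_cache_free(struct buf_cache *c, void *buf)
{
    /* Keep the buffer for reuse instead of returning it. */
    *(void **)buf = c->free_list;
    c->free_list = buf;
}

static void
buf_cache_destroy(struct buf_cache *c)
{
    /* Hand all idle buffers back to the backend allocator. */
    while (c->free_list != NULL) {
        void *buf = c->free_list;
        c->free_list = *(void **)buf;
        free(buf);
    }
}
```

With this sketch, a caller would run buf_cache_init() once with
HASH_BUF_SIZE, wrap each compression call in buf_cache_alloc() and
buf_cache_free(), and after the first call the buffer comes from the
free list rather than from the backend.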

After a bit of benchmarking Illumos switched to using kmem_alloc for LZ4
compression as well (discarding the stack allocations, because they were
fragile and didn't do much for performance). It'd be interesting to see
why kmem operations on FreeBSD are so inefficient under load - perhaps
some worthwhile refactoring work there?

Because allocations above the page size (16K > 4K) are not cached by
the allocator. That could probably be improved, and some work is going
on there, but as far as I can see illumos also explicitly uses
kmem_cache_create() for ZIO in ZFS, probably to handle similar issues.

Or can you please post more details of your testing setup?

That was the SPEC 2008 NFS benchmark on a 2x6x2-core Xeon system, quickly
creating a huge number of files sized from 1K to several megabytes on an
FS with LZ4 compression enabled. Without this patch the profiler showed
about 20% adaptive lock spinning around the free call, which also does
TLB invalidation on all CPU cores. With this patch I see no issues from
allocation at all.


Interesting. Could you try switching to using an explicit kmem cache? I
considered doing this when changing the implementation in Illumos, but I
saw no performance benefits. If they are there when the system is under
memory pressure, then it's certainly something we'd like to fix on all
platforms.

Ah, I see: illumos has cache zones up to 128K, while FreeBSD has them
only up to 4K. That explains why you had no problems here with the 16K
hash allocation. So for illumos I guess my patch will be just cosmetic,
changing only the accounting, while for FreeBSD it is important.
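
The difference can be sketched as a simple size check. This is only an
illustration of the reasoning above, not actual allocator code from
either system: alloc_path_for() and the constant names are hypothetical,
and the two limits are just the figures quoted in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in this thread; the names are hypothetical. */
#define ILLUMOS_MAX_CACHED (128 * 1024)  /* cache zones up to 128K */
#define FREEBSD_MAX_CACHED (4 * 1024)    /* cache zones up to 4K */
#define LZ4_HASH_SIZE      (16 * 1024)   /* 16K hash allocation */

enum alloc_path {
    ALLOC_FROM_CACHE,  /* served by a cached size class */
    ALLOC_FROM_VM      /* too large for any cache, goes to the VM */
};

/* Decide which path a request takes for a given cache size limit. */
static enum alloc_path
alloc_path_for(size_t size, size_t max_cached_size)
{
    assert(max_cached_size > 0);
    return (size <= max_cached_size) ? ALLOC_FROM_CACHE : ALLOC_FROM_VM;
}
```

Under these assumptions, alloc_path_for(LZ4_HASH_SIZE,
ILLUMOS_MAX_CACHED) yields ALLOC_FROM_CACHE, while the same request with
FREEBSD_MAX_CACHED yields ALLOC_FROM_VM, which is the case an explicit
cache is meant to avoid.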

--
Alexander Motin
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
