On Tue, Nov 12, 2013 at 8:28 AM, Alexander Motin <[email protected]> wrote:

> On 12.11.2013 18:22, Matthew Ahrens wrote:
>
>> On Tue, Nov 12, 2013 at 7:51 AM, Saso Kiselkov <[email protected]> wrote:
>>
>>     On 11/12/13, 3:38 PM, Alexander Motin wrote:
>>      > Hi.
>>      >
>>      > While doing some performance tests I've found that LZ4 compression
>>      > in ZFS on FreeBSD allocates its hash memory directly from the VM
>>      > each time, which on a multi-core system under significant load may
>>      > consume more CPU time than the compression itself. On 64-bit
>>      > illumos that memory is allocated on the stack, but FreeBSD's kernel
>>      > stack is smaller and does not have sufficient space (16K). I've
>>      > made a quite simple patch to reduce the allocation overhead by
>>      > creating an allocation cache, the same as is done for ZIO. While
>>      > for 64-bit illumos this patch is a no-op, smaller architectures may
>>      > still benefit from it, the same as FreeBSD does.
>>      >
>>      > Any comments about it:
>>      > http://people.freebsd.org/~mav/lz4_alloc.patch ?
>>      >
>>
>>     After a bit of benchmarking Illumos switched to using kmem_alloc for
>>     LZ4 compression as well (discarding the stack allocations, because
>>     they were fragile and didn't do much for performance). It'd be
>>     interesting to see why kmem operations on FreeBSD are so inefficient
>>     under load - perhaps some worthwhile refactoring work there? Or can
>>     you please post more details of your testing setup?
>>
>>
>> My understanding is that on FreeBSD, kmem_cache_alloc() uses
>> uma_zalloc_arg(), which has fast, per-CPU caches of free buffers (like
>> illumos).  But on FreeBSD, kmem_alloc() uses malloc(), which is slower
>> (whereas on illumos, kmem_alloc() just calls kmem_cache_alloc() from an
>> appropriately-sized cache).  See
>> sys/cddl/compat/opensolaris/kern/opensolaris_kmem.c for details.
>>
>> Does anyone know the reasoning behind this?  I.e. why kmem_alloc() does
>> not have similar performance characteristics on FreeBSD as on illumos?
>>
>
> FreeBSD malloc() does use uma_zalloc_arg() caches for small allocations.
> For big ones that is less practical, because large per-CPU caches tend to
> eat too much extra memory and it is quite hard to purge those per-CPU
> caches in a low-memory condition. But considering that illumos has a
> kmem_cache_alloc() KPI at all, there should probably also be some
> difference from plain kmem_alloc().
>

Yes, primarily the ability to use constructors/destructors to save time
when allocating.  But it's true that illumos kmem_alloc() also falls back
on a slow path (vmem_alloc()) for large allocations -- those above 128KB.
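
To illustrate the buffer reuse that an object cache provides (leaving
constructors/destructors aside), here is a minimal single-threaded userland
sketch in C. It is not the illumos or FreeBSD implementation: obj_cache_t,
cache_alloc(), cache_free() and the counters are hypothetical names chosen for
this example, malloc() merely stands in for the slow path, and the per-CPU
layers and locking of a real kernel cache are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Size of the LZ4 hash memory discussed in the thread (16K). */
#define HASH_BUF_SIZE (16 * 1024)

/* A free buffer is reused as a list node while it sits in the cache. */
typedef struct buf_node {
    struct buf_node *next;
} buf_node_t;

/* Hypothetical object cache: fixed-size buffers kept on a free list. */
typedef struct obj_cache {
    size_t      size;         /* size of every buffer in this cache */
    buf_node_t *free_list;    /* buffers returned and ready for reuse */
    unsigned    fast_allocs;  /* allocations served from the free list */
    unsigned    slow_allocs;  /* allocations that fell through to malloc() */
} obj_cache_t;

static void cache_init(obj_cache_t *c, size_t size)
{
    /* Each buffer must be able to hold the list node when free. */
    assert(size >= sizeof(buf_node_t));
    c->size = size;
    c->free_list = NULL;
    c->fast_allocs = 0;
    c->slow_allocs = 0;
}

static void *cache_alloc(obj_cache_t *c)
{
    buf_node_t *n = c->free_list;

    if (n != NULL) {
        /* Fast path: pop a previously freed buffer. */
        c->free_list = n->next;
        c->fast_allocs++;
        return n;
    }
    /* Slow path: ask the underlying allocator (assumed expensive). */
    c->slow_allocs++;
    return malloc(c->size);
}

static void cache_free(obj_cache_t *c, void *buf)
{
    /* Keep the buffer for the next caller instead of releasing it. */
    buf_node_t *n = buf;
    n->next = c->free_list;
    c->free_list = n;
}

static void cache_destroy(obj_cache_t *c)
{
    /* Release everything still held on the free list. */
    while (c->free_list != NULL) {
        buf_node_t *n = c->free_list;
        c->free_list = n->next;
        free(n);
    }
}
```

In this sketch only the first request pays for the underlying allocation; a
compress call that frees its buffer lets the next call take the fast path.
Under memory pressure something like cache_destroy() would have to run to give
the held buffers back.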

--matt
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
