On 11/12/13, 4:16 PM, Alexander Motin wrote:
> On 12.11.2013 18:12, Saso Kiselkov wrote:
>> On 11/12/13, 4:08 PM, Alexander Motin wrote:
>>> On 12.11.2013 17:51, Saso Kiselkov wrote:
>>>> On 11/12/13, 3:38 PM, Alexander Motin wrote:
>>>>> Hi.
>>>>>
>>>>> While doing some performance tests I've found that LZ4 compression
>>>>> in ZFS on FreeBSD allocates its hash memory directly from VM each
>>>>> time, which on a multi-core system under significant load may
>>>>> consume more CPU time than the compression itself. On 64-bit
>>>>> illumos that memory is allocated on the stack, but FreeBSD's
>>>>> kernel stack is smaller (16K) and does not have sufficient space.
>>>>> I've made a quite simple patch to reduce the allocation overhead
>>>>> by creating an allocation cache, the same as is done for ZIO.
>>>>> While on 64-bit illumos this patch is a no-op, smaller
>>>>> architectures may still benefit from it, just as FreeBSD does.
>>>>>
>>>>> Any comments about it:
>>>>> http://people.freebsd.org/~mav/lz4_alloc.patch ?
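
(For anyone following along: the per-call allocation in question is the
HEAPMODE path in lz4.c. From memory it looks roughly like the sketch
below; exact function and type names may differ slightly, and the 64k
small-input variant is ignored.)

    #if HEAPMODE
    	/* One ~16K hash-table allocation per compressed block. */
    	void *ctx = kmem_zalloc(sizeof (struct refTables), KM_NOSLEEP);

    	if (ctx == NULL)
    		return (0);	/* caller stores the block uncompressed */
    	result = LZ4_compressCtx(ctx, source, dest, isize, osize);
    	kmem_free(ctx, sizeof (struct refTables));
    #else
    	/*
    	 * The hash table is declared on the kernel stack inside the
    	 * compressor - fine on 64-bit illumos, but too big for
    	 * FreeBSD's 16K kernel stacks.
    	 */
    	result = LZ4_compressCtx(NULL, source, dest, isize, osize);
    #endif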
>>>>
>>>> After a bit of benchmarking, illumos switched to using kmem_alloc
>>>> for LZ4 compression as well (discarding the stack allocations,
>>>> because they were fragile and didn't do much for performance). It'd
>>>> be interesting to see why kmem operations on FreeBSD are so
>>>> inefficient under load - perhaps there's some worthwhile
>>>> refactoring work to be done there?
>>>
>>> Because allocations above page size (16K > 4K) are not cached by the
>>> allocator. That could probably be improved, and some work is going
>>> on there, but as far as I can see, illumos also explicitly uses
>>> kmem_cache_create() for ZIO in ZFS, probably to handle similar
>>> issues.
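
(The ZIO pattern mentioned here, for reference: the cache is created
once at init time and per-call buffers are drawn from it afterwards. A
minimal sketch with hypothetical names, modeled on zio_init():)

    static kmem_cache_t *lz4_ctx_cache;	/* hypothetical name */

    void
    lz4_init(void)
    {
    	lz4_ctx_cache = kmem_cache_create("lz4_ctx",
    	    sizeof (struct refTables), 0, NULL, NULL, NULL, NULL,
    	    NULL, 0);
    }

    void
    lz4_fini(void)
    {
    	kmem_cache_destroy(lz4_ctx_cache);
    }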
>>>
>>>> Or can you please post more details of your testing setup?
>>>
>>> That was the SPEC 2008 NFS benchmark on a 2x6x2-core Xeon system,
>>> quickly creating a huge number of files sized from 1K to several
>>> megabytes on a filesystem with LZ4 compression enabled. Without this
>>> patch the profiler showed me about 20% of the time in adaptive lock
>>> spinning around the free call, which also does TLB invalidation on
>>> all CPU cores. With this patch I see no issues from allocation at
>>> all.
>>>
>>
>> Interesting. Could you try switching to using an explicit kmem cache?
> 
> That is what I did in my patch. Or do you mean something else?

Sorry, got your change confused with what Pawel was suggesting (using an
enlarged stack). Looks good - you can even get rid of the HEAPMODE
conditionals there. We should always use heap/cache allocation and never
the unreliable stack stuff.
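
Concretely, once HEAPMODE is gone the compress entry point reduces to
something like the sketch below (again ignoring the 64k small-input
variant; lz4_ctx_cache stands in for whatever your patch names the
cache). Note the explicit bzero: unlike kmem_zalloc, buffers from a
kmem cache come back with whatever the previous user left in them.

    static int
    real_LZ4_compress(const char *source, char *dest, int isize,
        int osize)
    {
    	void *ctx = kmem_cache_alloc(lz4_ctx_cache, KM_NOSLEEP);
    	int result;

    	/* Out of memory: zio then stores the block uncompressed. */
    	if (ctx == NULL)
    		return (0);

    	bzero(ctx, sizeof (struct refTables));
    	result = LZ4_compressCtx(ctx, source, dest, isize, osize);
    	kmem_cache_free(lz4_ctx_cache, ctx);
    	return (result);
    }

The KM_NOSLEEP-and-return-0 fallback preserves the current behavior,
where compression is simply skipped when the allocator can't deliver.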

Cheers,
-- 
Saso