On 10/15/13 3:40 AM, Richard Yao wrote:
> On 10/14/2013 10:23 PM, Richard Yao wrote:
>> I hope to get ZFS into a reasonable state on 32-bit Linux soon with
>> patches to restructure the ZIO buffers to use arrays of pages instead
>> of slab objects. That should also eliminate the need for the hacks
>> that currently exist in the SPL and get our allocation sizes down to
>> no more than two contiguous 4KB pages (although I would be much
>> happier to get everything below one 4KB page). With that in mind, I
>> would greatly prefer to see the hash table implemented inside a
>> b-tree to avoid the use of gigabyte-sized memory allocations (i.e.
>> doing virtual memory manually). Illumos can be expected to handle
>> such large allocations, even on 32-bit systems, but other kernels
>> cannot.
>
> On second thought, there is no reason why other ZFS consumers should
> suffer for Linux's deficiencies. We could always disable persistent
> L2ARC by default on 32-bit Linux and warn users of the pitfalls. If
> things do not work out well on 64-bit Linux, we could do the same
> there too.
>
> If there is not already an option for disabling persistent L2ARC by
> default, it would be greatly appreciated if you would take the time to
> add it. That way this code could be merged into ZFSOnLinux with the
> addition of a few preprocessor directives to alter the default
> behavior.
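For concreteness, the compile-time default Richard describes could be
wired up roughly as in the sketch below; the variable name and the
_LP64 test are illustrative only, not an existing tunable.

/*
 * Illustrative sketch only: flip the built-in default for persistent
 * L2ARC at compile time on 32-bit builds while leaving it adjustable
 * at runtime.  The tunable name here is made up for the example.
 */
#ifdef _LP64
int zfs_l2arc_persist_enabled = 1;	/* 64-bit: persistent L2ARC on by default */
#else
int zfs_l2arc_persist_enabled = 0;	/* 32-bit: off by default, can be enabled */
#endif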
This isn't concerning L2ARC per se, but rather the restructuring or
refactoring of the ARC hash table implementation. At present the code
kmem_zalloc's a single block whose size is 128 kB for each 1 GB of
physical memory (i.e. on a machine with 128 GB of physmem it allocates
a 16 MB chunk, on a machine with 1 TB it's 128 MB, etc.).

What we'd like to do is introduce a kernel tunable that lets an
administrator set the hash table size (instead of it being hard-coded)
and expand the default to 1 MB per 1 GB of memory (so 128 GB of physmem
=> 128 MB hash table, 1024 GB of physmem => 1 GB hash table).

Expecting that allocations this large (and larger, as machines with
8 TB+ of physmem become available and widespread) could cause trouble
down the road, I modified the ARC hash table implementation to make it
a 2D hash table, which lowers the length of any single memory request
to at most sqrt(HT_size). Matt argues that my original assumption and
the added complexity are unnecessary, and I honestly don't know. What
do you think?
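To make the "2D" layout concrete, here is a rough userland sketch of
the idea (the names are made up and calloc() stands in for
kmem_zalloc(); this is not the actual patch): instead of one
allocation of HT_size bytes, the table is split into roughly
sqrt(HT_size)-sized chunks reached through a small outer array of
pointers.

#include <stdint.h>
#include <stdlib.h>
#include <math.h>

typedef struct ht_2d {
	void	**ht_chunks;	/* outer array of chunk pointers */
	size_t	ht_nchunks;	/* number of chunks */
	size_t	ht_chunksz;	/* bytes per chunk (~sqrt of total) */
	size_t	ht_bucketsz;	/* bytes per bucket, e.g. sizeof (void *) */
} ht_2d_t;

int
ht_2d_alloc(ht_2d_t *ht, size_t total_bytes, size_t bucketsz)
{
	size_t chunksz, nchunks, i;

	/* Chunk size is ~sqrt(total), rounded up to whole buckets. */
	chunksz = (size_t)ceil(sqrt((double)total_bytes));
	chunksz = (chunksz + bucketsz - 1) / bucketsz * bucketsz;
	nchunks = (total_bytes + chunksz - 1) / chunksz;

	ht->ht_chunks = calloc(nchunks, sizeof (void *));
	if (ht->ht_chunks == NULL)
		return (-1);
	for (i = 0; i < nchunks; i++) {
		ht->ht_chunks[i] = calloc(1, chunksz);
		if (ht->ht_chunks[i] == NULL)
			return (-1);	/* caller unwinds the partial table */
	}
	ht->ht_nchunks = nchunks;
	ht->ht_chunksz = chunksz;
	ht->ht_bucketsz = bucketsz;
	return (0);
}

/* Map a hash value to its bucket through the two-level table. */
void *
ht_2d_bucket(const ht_2d_t *ht, uint64_t hval)
{
	size_t per_chunk = ht->ht_chunksz / ht->ht_bucketsz;
	size_t nbuckets = ht->ht_nchunks * per_chunk;
	size_t idx = hval % nbuckets;

	return ((char *)ht->ht_chunks[idx / per_chunk] +
	    (idx % per_chunk) * ht->ht_bucketsz);
}

With a 1 GB table that works out to roughly 32768 chunks of about 32 kB
each instead of a single 1 GB kmem_zalloc().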
Cheers,
--
Saso