On 10/15/13 3:40 AM, Richard Yao wrote:
> On 10/14/2013 10:23 PM, Richard Yao wrote:
>> I hope to get ZFS into a reasonable state on 32-bit Linux soon with
>> patches to restructure the ZIO buffers to use arrays of pages instead
>> of slab objects. That should also eliminate the need for the hacks
>> that currently exist in the SPL and get our allocation sizes down to
>> no more than two contiguous 4KB pages (although I would be much
>> happier to get everything below one 4KB page). With that in mind, I
>> would greatly prefer to see the hash table implemented inside a
>> b-tree to avoid the use of gigabyte-sized memory allocations (i.e.
>> doing virtual memory manually). Illumos can be expected to handle
>> such large allocations, even on 32-bit systems, but other kernels
>> cannot.
>
> On second thought, there is no reason why other ZFS consumers should
> suffer for Linux's deficiencies. We could always disable persistent
> L2ARC by default on 32-bit Linux and warn users of the pitfalls. If
> things do not work out well on 64-bit Linux, we could do the same
> there too.
>
> If there is not already an option for disabling persistent L2ARC by
> default, it would be greatly appreciated if you would take the time to
> add it. That way this code could be merged into ZFSOnLinux with the
> addition of a few preprocessor directives to alter the default
> behavior.
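For concreteness, the compile-time default Richard describes could be
wired up roughly as in the sketch below; the variable name and the
_LP64 test are illustrative only, not an existing tunable.

/*
 * Illustrative sketch only: flip the built-in default for persistent
 * L2ARC at compile time on 32-bit builds while leaving it adjustable
 * at runtime.  The tunable name here is made up for the example.
 */
#ifdef _LP64
int zfs_l2arc_persist_enabled = 1;	/* 64-bit: persistent L2ARC on by default */
#else
int zfs_l2arc_persist_enabled = 0;	/* 32-bit: off by default, can be enabled */
#endif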
This isn't concerning L2ARC per se, but rather the restructuring or
refactoring of the ARC hash table implementation. At present the code
kmem_zalloc's a single block whose size is 128 kB for each 1 GB of
physical memory (i.e. on a machine with 128 GB of physmem it allocates
a 16 MB chunk, on a machine with 1 TB it's 128 MB, etc.).

What we'd like to do is introduce a kernel tunable that lets an
administrator set the hash table size (instead of it being hard-coded)
and expand the default to 1 MB per 1 GB of memory (so 128 GB of physmem
=> 128 MB hash table, 1024 GB of physmem => 1 GB hash table).

Expecting that allocations this large (and larger, as machines with
8 TB+ of physmem become available and widespread) could cause trouble
down the road, I modified the ARC hash table implementation to make it
a 2D hash table, which lowers the length of any single memory request
to at most sqrt(HT_size). Matt argues that my original assumption and
the added complexity are unnecessary, and I honestly don't know. What
do you think?
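To make the "2D" layout concrete, here is a rough userland sketch of
the idea (the names are made up and calloc() stands in for
kmem_zalloc(); this is not the actual patch): instead of one
allocation of HT_size bytes, the table is split into roughly
sqrt(HT_size)-sized chunks reached through a small outer array of
pointers.

#include <stdint.h>
#include <stdlib.h>
#include <math.h>

typedef struct ht_2d {
	void	**ht_chunks;	/* outer array of chunk pointers */
	size_t	ht_nchunks;	/* number of chunks */
	size_t	ht_chunksz;	/* bytes per chunk (~sqrt of total) */
	size_t	ht_bucketsz;	/* bytes per bucket, e.g. sizeof (void *) */
} ht_2d_t;

int
ht_2d_alloc(ht_2d_t *ht, size_t total_bytes, size_t bucketsz)
{
	size_t chunksz, nchunks, i;

	/* Chunk size is ~sqrt(total), rounded up to whole buckets. */
	chunksz = (size_t)ceil(sqrt((double)total_bytes));
	chunksz = (chunksz + bucketsz - 1) / bucketsz * bucketsz;
	nchunks = (total_bytes + chunksz - 1) / chunksz;

	ht->ht_chunks = calloc(nchunks, sizeof (void *));
	if (ht->ht_chunks == NULL)
		return (-1);
	for (i = 0; i < nchunks; i++) {
		ht->ht_chunks[i] = calloc(1, chunksz);
		if (ht->ht_chunks[i] == NULL)
			return (-1);	/* caller unwinds the partial table */
	}
	ht->ht_nchunks = nchunks;
	ht->ht_chunksz = chunksz;
	ht->ht_bucketsz = bucketsz;
	return (0);
}

/* Map a hash value to its bucket through the two-level table. */
void *
ht_2d_bucket(const ht_2d_t *ht, uint64_t hval)
{
	size_t per_chunk = ht->ht_chunksz / ht->ht_bucketsz;
	size_t nbuckets = ht->ht_nchunks * per_chunk;
	size_t idx = hval % nbuckets;

	return ((char *)ht->ht_chunks[idx / per_chunk] +
	    (idx % per_chunk) * ht->ht_bucketsz);
}

With a 1 GB table that works out to roughly 32768 chunks of about 32 kB
each instead of a single 1 GB kmem_zalloc().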
Cheers,
--
Saso