It is worth mentioning that a system with 1TB worth of 512-byte ARC buffers 
would again favor an AVL tree implementation, unless the hash table is grown 
even further to compensate.
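To make the point above concrete, a quick sketch (using the same 1TB working set as the quoted mail; the buffer counts are derived, not measured from a real system):

```python
# 1TB of data split into 512-byte buffers has 8x the headers of the
# 4k-buffer case discussed below, so a hash table sized for 4k buffers
# sees 8x longer chains unless it is grown by the same factor.
bufs_4k = 2**40 // 4096     # 268435456 headers
bufs_512 = 2**40 // 512     # 2147483648 headers
growth = bufs_512 // bufs_4k

print(bufs_4k, bufs_512, growth)  # 268435456 2147483648 8
```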

On Oct 16, 2013, at 5:56 PM, Saso Kiselkov <[email protected]> wrote:

> Just to be on the record on the list, I'd like to add one more thing I
> wrote to Richard which didn't make it to the mailing list (Richard: I
> updated some numbers after running some more calculations):
> 
> For instance, let's say you're trying to represent 1TB worth of 4k ARC
> buffers, i.e. 268435456 buffers (doesn't matter if they sit in ARC or
> L2ARC, the headers are still there).
> 
> Some important constants used below:
>    sizeof (void *) =    8
>    sizeof (avl_node_t) =    24
>    sizeof (avl_tree_t) =    40
> 
> Using an AVL tree you'll need to spend (best case):
> 2^40 / 4096 * sizeof (avl_node_t) = 6 GB
> 
> Using a linked list, it's going to be:
> 2^40 / 4096 * sizeof (void *) = 2 GB
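The two memory figures above can be reproduced as follows (the struct sizes are the 64-bit values quoted in the mail, used here for illustration):

```python
# Per-header linkage cost for 1TB of 4k ARC buffers.
SIZEOF_VOID_P = 8      # sizeof (void *)
SIZEOF_AVL_NODE = 24   # sizeof (avl_node_t)

nbufs = 2**40 // 4096                  # 2^28 = 268435456 headers
avl_bytes = nbufs * SIZEOF_AVL_NODE    # embedded AVL node per header
list_bytes = nbufs * SIZEOF_VOID_P     # single next pointer per header

print(avl_bytes // 2**30, "GB vs", list_bytes // 2**30, "GB")  # 6 GB vs 2 GB
```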
> 
> Now, how large is the hash table going to be? The minimum amount of
> physical DRAM to hold that many buffers is 96GB (268435456 buffers *
> (sizeof (arc_buf_hdr_t) + sizeof (l2arc_buf_hdr_t)) comes in at ~74GB).
> If you dedicate 0.1% of that to your hash table, that comes to:
> 
> Number of linked list "head" pointers that fit in:
> 96 * 2^20 / sizeof (void *) = 12582912
> 
> Number of AVL trees that fit in:
> 96 * 2^20 / sizeof (avl_tree_t) = 2516582
> 
> Thus the average linked list chain length will be 268435456 / 12582912 =
> 21.33, whereas the average tree size would be 268435456 / 2516582 =
> 106.66. So far, seemingly we'd be justified in saying that AVL trees
> will be better, because O(n) for n=21.33 is greater than O(log(n)) for
> n=106.66 ('echo "l(106.66)/l(2)" | bc -l' gives 6.73), so even
> considering a fixed callback overhead on the AVL tree comparator, it
> should still be somewhat faster.
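The chain-length vs. tree-depth comparison above, redone in a few lines (same 96MB table and struct sizes as in the mail; the mail's 6.73 comes from feeding the truncated 106.66 to bc):

```python
import math

nbufs = 268435456                 # 2^28 buffers
table_bytes = 96 * 2**20          # 96MB hash table (0.1% of 96GB)

list_heads = table_bytes // 8     # sizeof (void *)  -> 12582912 heads
avl_roots = table_bytes // 40     # sizeof (avl_tree_t) -> 2516582 roots

chain_len = nbufs / list_heads    # average O(n) list walk
tree_size = nbufs / avl_roots
tree_depth = math.log2(tree_size) # average O(log n) tree walk

print(round(chain_len, 2), round(tree_depth, 2))  # 21.33 6.74
```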
> 
> BUT BUT BUT (!) - and here's the real kicker - we expended three times
> as much memory in the first step to get this performance (6GB for trees
> vs. 2 GB for linked lists). What if we instead chose to reinvest part
> of that overhead into the hash table to get shorter lists? What if,
> instead of a 96MB hash table (0.1% of physmem), we chose to go with a
> 384MB one (0.4% of physmem)? That gives us 4x as many hash buckets and
> thus an average linked-list chain length of 1/4 the original, i.e.
> 5.33. Now that's shorter than the AVL tree case, and at a fraction of
> the total memory requirements!
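The reinvestment argument above in code form: quadruple the table, quarter the chains (same assumed sizes as before):

```python
nbufs = 268435456  # 2^28 buffers

for table_mb, pct in [(96, "0.1%"), (384, "0.4%")]:
    heads = table_mb * 2**20 // 8   # linked-list head pointers in the table
    print(f"{table_mb}MB ({pct}): avg chain length {nbufs / heads:.2f}")
# 96MB gives 21.33, 384MB gives 5.33
```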
> 
> The only real difference is that the fixed memory cost of AVL trees
> when you're not using them can be kept very small (96MB in the above
> example, vs. 384MB for hash tables). But if you *do* use them a lot
> (as one would expect on a high-volume machine), the variable memory
> cost of the AVL tree nodes quickly overtakes the total cost of linked
> lists, and it doesn't even give you any extra performance.
> 
> Cheers,
> -- 
> Saso
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
