It is worth mentioning that a system with 1TB worth of 512-byte ARC buffers would again favor an AVL tree implementation, unless the hash table were grown even further.
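To see why, the arithmetic from the quoted message can be rerun for 512-byte buffers (a sketch using the same constants and the same 96MB hash table quoted in the email, not live values from the ZFS headers):

```python
import math

# Rerunning the email's arithmetic for 512-byte buffers instead of 4k.
PTR, AVL_TREE = 8, 40             # sizeof (void *), sizeof (avl_tree_t)
nbufs = 2**40 // 512              # 2147483648 headers for 1TB

ht = 96 * 2**20                   # same 96MB hash table as in the email
chain = nbufs / (ht // PTR)       # average list chain, ~170.67: far too long
depth = math.log2(nbufs / (ht // AVL_TREE))  # average tree depth, ~9.74

print(round(chain, 2), round(depth, 2))
```

With eight times as many headers, the average list chain balloons to ~170 while the per-bucket AVL tree depth only grows to ~9.74, so the linked-list design only stays competitive if the hash table is enlarged well beyond 96MB.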
On Oct 16, 2013, at 5:56 PM, Saso Kiselkov <[email protected]> wrote:

> Just to be on the record on the list, I'd like to add one more thing I
> wrote to Richard which didn't make it to the mailing list (Richard: I
> updated some numbers after running some more calculations):
>
> For instance, let's say you're trying to represent 1TB worth of 4k ARC
> buffers, i.e. 268435456 buffers (it doesn't matter whether they sit in
> ARC or L2ARC, the headers are still there).
>
> Some important constants used below:
> sizeof (void *) = 8
> sizeof (avl_node_t) = 24
> sizeof (avl_tree_t) = 40
>
> Using an AVL tree you'll need to spend (best case):
> 2^40 / 4096 * sizeof (avl_node_t) = 6 GB
>
> Using a linked list, it's going to be:
> 2^40 / 4096 * sizeof (void *) = 2 GB
>
> Now, how large is the hash table going to be? The minimum amount of
> physical DRAM to hold that many buffers is 96GB (268435456 buffers *
> (sizeof (arc_buf_hdr_t) + sizeof (l2arc_buf_hdr_t)) comes in at ~74GB).
> If you dedicate 0.1% of that to your hash table, that comes to:
>
> Number of linked-list "head" pointers that fit in it:
> 96 * 2^20 / sizeof (void *) = 12582912
>
> Number of AVL trees that fit in it:
> 96 * 2^20 / sizeof (avl_tree_t) = 2516582
>
> Thus the average linked-list chain length will be 268435456 / 12582912 =
> 21.33, whereas the average tree size would be 268435456 / 2516582 =
> 106.66. So far we'd seemingly be justified in saying that AVL trees
> will be better, because O(n) for n=21.33 is greater than O(log(n)) for
> n=106.66 ('echo "l(106.66)/l(2)" | bc -l' gives 6.73), so even
> considering a fixed callback overhead on the AVL tree comparator, it
> should still be somewhat faster.
>
> BUT BUT BUT (!) - and here's the real kicker - we expended three times
> as much memory in the first step to get this performance (6GB for trees
> vs. 2GB for linked lists). What if we instead chose to reinvest a part
> of that overhead into the hash table and get shorter lists? What if,
> instead of a 96MB (0.1% of physmem) hash table, we chose to go with a
> 384MB (0.4% of physmem) one? That gives us 4x as many hash buckets and
> thus a linked-list chain length that is on average 1/4 the original,
> i.e. 5.33. Now that's shorter than the AVL tree case, and at a fraction
> of the total memory requirements!
>
> The only real difference is that the fixed memory cost of AVL trees
> when you're not using them can be made very small (e.g. 96MB in the
> above example, vs. 384MB for hash tables). But if you *do* use them a
> lot (as one would expect on a high-volume machine), the variable memory
> cost of AVL tree nodes quickly overtakes the total cost of linked lists
> without giving you any extra performance.
>
> Cheers,
> --
> Saso

_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
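The arithmetic in the message can be double-checked with a few lines of Python (a sketch; the sizeofs are the LP64 constants quoted in the email, not values read from the actual ZFS headers):

```python
import math

# Constants quoted in the email (LP64).
PTR = 8        # sizeof (void *)
AVL_NODE = 24  # sizeof (avl_node_t)
AVL_TREE = 40  # sizeof (avl_tree_t)

nbufs = 2**40 // 4096          # 1TB of 4k buffers -> 268435456 headers

avl_cost = nbufs * AVL_NODE    # per-header AVL linkage: 6 GB
list_cost = nbufs * PTR        # per-header list pointer: 2 GB

ht = 96 * 2**20                # 96MB hash table (0.1% of 96GB physmem)
list_heads = ht // PTR         # 12582912 buckets of list head pointers
tree_heads = ht // AVL_TREE    # 2516582 buckets of avl_tree_t roots

chain = nbufs / list_heads     # average chain length, ~21.33
depth = math.log2(nbufs / tree_heads)  # average tree depth, ~6.73

# Reinvesting: a 4x larger (384MB) table quarters the chain length to ~5.33,
# which beats the tree depth at a fraction of the per-header memory cost.
print(avl_cost // 2**30, list_cost // 2**30,
      round(chain, 2), round(depth, 2), round(chain / 4, 2))
```

Every figure in the email checks out: the 3x per-header overhead of AVL nodes (6GB vs. 2GB) buys a per-bucket lookup of ~6.73 comparisons, while simply quadrupling the hash table drops the list chains to ~5.33 for only 288MB more.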
