On Tue, Oct 15, 2013 at 10:47:22PM +0100, Saso Kiselkov wrote:
> On 10/15/13 10:26 PM, Prakash Surya wrote:
> > On Sun, Oct 13, 2013 at 12:21:05AM +0100, Saso Kiselkov wrote:
> >> The performance gains from this are pretty substantial in my testing.
> >> The time needed to rebuild 93 GB worth of ARC buffers consisting of a
> >> mixture of 128k and 8k blocks (total number of ARC buffers: 6.4 million,
> >> or ~15.4k average block size) decreased from ~50s to about 9.5s (with a
> >> small portion of that being physical L2ARC read latency).
> > 
> > Where exactly did the performance improvement come from? I'm not
> > doubting the work, I'm just curious. Is it faster because the hash
> > chains are shorter now (i.e. less "work" done under bucket lock leading
> > to better concurrency)? Or something else?
> > 
> > If the improvement does indeed come from increased concurrency due
> > to shorter hash chains, I'd imagine this would benefit workloads other
> > than L2ARC rebuild times.
> > 
> > As such, it would be nice to have a before/after workload showing the
> > better performance numbers without having to rely on the persistent
> > L2ARC changes. Especially since I see this work being split into its
> > own independent patch, as posted to the open-zfs list.
> 
> The performance improvement has nothing to do with locking or
> concurrency, it is simply due to the fact that we have to do less work
> to find a particular buffer in the hash table.
> 

OK, that is where I assumed the speed up was coming from (shorter
chains leading to faster lookups).

I also assumed there would be a "bucket lock" that needs to be acquired
during this lookup similar to the dbuf hash (which would affect
concurrency), but I guess that isn't the case here (I haven't studied
the arc hash code as well as I have the dbuf hash code).

> I hope you won't find it presumptuous on my part if I believe you are
> not very familiar with how the ARC hash table (or perhaps any hash
> table?) works. So a quick recap:
> 
>  * All reads in ZFS go through arc_read() to cache buffers.
>  * The request specifies a buffer by DVA.
>  * Thus the ARC needs some method of locating a buffer in the cache
>    by its DVA.
> 
> When the ARC reads in a buffer from the pool, it quickly computes a
> CRC64 "hash" from the buffer's DVA (plus a few other bits of info in the
> arc_read() request). This hash is then used as an index into the hash
> table. The items of the hash table are linked lists of ARC buffers. If
> there already is an ARC buffer at the index, the current buffer is
> appended to the list and we're done.
> 
> Similarly, if we're looking for a buffer by DVA, we compute the hash
> index and walk the linked list at that index in the hash table until
> we find the buffer that exactly matches the search criteria. There may
> be multiple buffers in the chain due to hash collisions (i.e. multiple
> buffers with different DVAs hashing to the same index) - this is
> normal and expected.
> 
> For details see buf_hash_find(), buf_hash_insert() and buf_hash_remove().
> 
> What my change does is expand the size of the hash table so that
> fewer hash collisions occur, leading to shorter buffer chains and
> faster lookups.

So, if this simply comes down to a hash collision issue, can't we take
this a bit further? Can we make the hash table size completely dynamic?
Instead of using another heuristic, can we grow (and shrink?) the table
as the workload demands? So if the max chain depth reaches a threshold,
we increase the number of hash buckets (e.g. double it).

Of course the details of performing the grow (i.e. rehash) operation
would need to be worked out so it doesn't affect performance,
consistency, etc. But from a theoretical standpoint, sizing it
dynamically seems like a much better solution, IMO.

-- 
Cheers, Prakash

> 
> Cheers,
> -- 
> Saso
> 
> 
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
