On 10/16/13 2:42 AM, Prakash Surya wrote:
> OK, that is where I assumed the speed up was coming from (shorter
> chains leading to faster lookups).
>
> I also assumed there would be a "bucket lock" that needs to be acquired
> during this lookup similar to the dbuf hash (which would affect
> concurrency), but I guess that isn't the case here (I haven't studied
> the arc hash code as well as I have the dbuf hash code).

It's mostly the same. Dbuf would perhaps also benefit a bit from
restructuring.

> So, if this simply comes down to a hash collision issue, can't we try
> and take this a bit further.. Can we make the hash size be completely
> dynamic? Instead of using another heuristic, can we grow (and shrink?)
> the hash as the workload demands? So if the max chain depth reaches a
> threshold, we increase the number of hash buckets (e.g. double it).
>
> Of course the details of performing the grow (i.e. rehash) operation
> would need to be worked out so it doesn't affect performance,
> consistency, etc.. But from a theoretical stand point, moving it to be
> sized dynamically seems like a much better solution, IMO.

Several problems with this approach:

1) unpredictability - when do we trigger it and by what criteria?
2) the extreme expense of rehashing everything (easily dozens of seconds
   of one CPU pegged at 100% while everything else grinds to a halt as
   the ARC is inaccessible)
3) hard-to-diagnose latency spikes due to problem #2

The easiest approach is just to give the admin a knob which they can
twiddle at boot and then leave alone. If performance is really a
problem, they can schedule downtime. Doing it at runtime has its
problems and doing it automatically is dangerous.

Cheers,
--
Saso
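
To make the rehash cost in point 2 concrete, here is a minimal sketch in
C of a chained hash table whose bucket count is fixed at creation time,
plus a stop-the-world rehash. This is not the ARC code: every name in it
(table_t, table_rehash, and so on) and the multiplicative hash are
hypothetical stand-ins, and locking is left out entirely. It only shows
that lookup cost follows chain length, and that growing the table has to
touch every entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical illustration only; not the ARC's hash table. */
typedef struct entry {
	uint64_t key;
	struct entry *next;	/* collision chain */
} entry_t;

typedef struct table {
	entry_t **buckets;
	size_t nbuckets;	/* power of two, chosen once at creation */
	size_t nentries;
} table_t;

/* Stand-in hash: multiplicative mix, then mask to the bucket count. */
size_t
bucket_of(size_t nbuckets, uint64_t key)
{
	key *= 0x9E3779B97F4A7C15ULL;
	return ((size_t)(key >> 32) & (nbuckets - 1));
}

table_t *
table_create(size_t nbuckets)
{
	assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
	table_t *t = malloc(sizeof (*t));
	if (t == NULL)
		return (NULL);
	t->buckets = calloc(nbuckets, sizeof (entry_t *));
	if (t->buckets == NULL) {
		free(t);
		return (NULL);
	}
	t->nbuckets = nbuckets;
	t->nentries = 0;
	return (t);
}

int
table_insert(table_t *t, uint64_t key)
{
	entry_t *e = malloc(sizeof (*e));
	if (e == NULL)
		return (-1);
	size_t b = bucket_of(t->nbuckets, key);
	e->key = key;
	e->next = t->buckets[b];	/* push onto the head of the chain */
	t->buckets[b] = e;
	t->nentries++;
	return (0);
}

/* Lookup walks one chain; *steps reports how many entries were visited. */
entry_t *
table_lookup(const table_t *t, uint64_t key, size_t *steps)
{
	size_t n = 0;
	entry_t *e = t->buckets[bucket_of(t->nbuckets, key)];
	for (; e != NULL; e = e->next) {
		n++;
		if (e->key == key)
			break;
	}
	if (steps != NULL)
		*steps = n;
	return (e);
}

size_t
table_max_chain(const table_t *t)
{
	size_t max = 0;
	for (size_t b = 0; b < t->nbuckets; b++) {
		size_t len = 0;
		for (entry_t *e = t->buckets[b]; e != NULL; e = e->next)
			len++;
		if (len > max)
			max = len;
	}
	return (max);
}

/* Stop-the-world resize: every entry is relinked into the new array. */
int
table_rehash(table_t *t, size_t new_nbuckets, size_t *moved)
{
	assert(new_nbuckets != 0 && (new_nbuckets & (new_nbuckets - 1)) == 0);
	entry_t **nb = calloc(new_nbuckets, sizeof (entry_t *));
	if (nb == NULL)
		return (-1);
	size_t n = 0;
	for (size_t b = 0; b < t->nbuckets; b++) {
		entry_t *e = t->buckets[b];
		while (e != NULL) {
			entry_t *next = e->next;
			size_t i = bucket_of(new_nbuckets, e->key);
			e->next = nb[i];
			nb[i] = e;
			e = next;
			n++;
		}
	}
	free(t->buckets);
	t->buckets = nb;
	t->nbuckets = new_nbuckets;
	if (moved != NULL)
		*moved = n;
	return (0);
}
```

With 4 buckets and 1000 keys, some chain must hold at least 250 entries,
so a lookup can walk that many. After table_rehash(t, 1024, &moved) the
chains are shorter, but moved is 1000: every entry was visited. In a
real cache the table would presumably have to be kept away from
concurrent lookups for the whole walk, which is where the stall comes
from.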
