On Oct 15, 2013, at 9:57 PM, Saso Kiselkov <[email protected]> wrote:

> On 10/16/13 2:42 AM, Prakash Surya wrote:
>> 
>> So, if this simply comes down to a hash collision issue, can't we try
>> and take this a bit further.. Can we make the hash size be completely
>> dynamic? Instead of using another heuristic, can we grow (and shrink?)
>> the hash as the workload demands? So if the max chain depth reaches a
>> threshold, we increase the number of hash buckets (e.g. double it).
>> 
>> Of course the details of performing the grow (i.e. rehash) operation
>> would need to be worked out so it doesn't affect performance,
>> consistency, etc.. But from a theoretical stand point, moving it to be
>> sized dynamically seems like a much better solution, IMO.
> 
> Several problems with this approach:
> 
> 1) unpredictability - when do trigger it and by what criteria?
> 2) the extreme expense of rehashing everything (easily dozens of
>   seconds of one CPU pegged at 100% while everything else grinds
>   to a halt as the ARC is inaccessible)
> 3) hard to diagnose latency spikes due to problem #2
> 
> The easiest approach is just to give the admin a knob which they can
> twiddle at boot and then leave alone. If performance is really a
> problem, they can schedule downtime. Doing it at runtime has its
> problems and doing it automatically is dangerous.

Another possibility is to switch from linked lists to AVL trees. That would 
make the cost of collisions less expensive.
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer

Reply via email to