On 10/16/13 2:42 AM, Prakash Surya wrote:
> OK, that is where I assumed the speed up was coming from (shorter
> chains leading to faster lookups).
> 
> I also assumed there would be a "bucket lock" that needs to be acquired
> during this lookup similar to the dbuf hash (which would affect
> concurrency), but I guess that isn't the case here (I haven't studied
> the arc hash code as well as I have the dbuf hash code).

It's mostly the same. Dbuf would also benefit a bit from restructuring,
perhaps.
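For illustration, a chained lookup under a striped bucket lock looks roughly like the sketch below. This is a standalone toy, not the actual ARC or dbuf code: all names and sizes are made up, and the real code stripes a fixed lock array over the buckets in a similar spirit. The point is that the lock is held for the whole chain walk, so chain length directly extends the critical section.

```c
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Illustrative chained hash table with striped bucket locks.
 * Names and sizes are invented for this sketch, not taken from ZFS.
 */
#define	NBUCKETS	1024	/* power of two */
#define	NLOCKS		64	/* each lock covers NBUCKETS/NLOCKS buckets */

typedef struct entry {
	uint64_t	key;
	struct entry	*next;		/* collision chain */
} entry_t;

static entry_t		*buckets[NBUCKETS];
static pthread_mutex_t	locks[NLOCKS];

static void
hash_init(void)
{
	for (int i = 0; i < NLOCKS; i++)
		(void) pthread_mutex_init(&locks[i], NULL);
}

static uint64_t
hash(uint64_t key)
{
	/* any decent mixer will do; this is a 64-bit finalizer */
	key ^= key >> 33;
	key *= 0xff51afd7ed558ccdULL;
	key ^= key >> 33;
	return (key);
}

static entry_t *
hash_lookup(uint64_t key)
{
	uint64_t h = hash(key);
	pthread_mutex_t *lk = &locks[h & (NLOCKS - 1)];
	entry_t *e;

	pthread_mutex_lock(lk);
	/* the lock is held for the whole chain walk, so longer
	 * chains mean a longer critical section and less concurrency */
	for (e = buckets[h & (NBUCKETS - 1)]; e != NULL; e = e->next)
		if (e->key == key)
			break;
	pthread_mutex_unlock(lk);
	return (e);
}

static void
hash_insert(entry_t *e)
{
	uint64_t h = hash(e->key);
	pthread_mutex_t *lk = &locks[h & (NLOCKS - 1)];

	pthread_mutex_lock(lk);
	e->next = buckets[h & (NBUCKETS - 1)];
	buckets[h & (NBUCKETS - 1)] = e;
	pthread_mutex_unlock(lk);
}
```

Since the locks are striped rather than per-bucket, unrelated buckets sharing a stripe also contend, which is another way oversized chains hurt concurrency.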

> So, if this simply comes down to a hash collision issue, can't we try
> and take this a bit further.. Can we make the hash size be completely
> dynamic? Instead of using another heuristic, can we grow (and shrink?)
> the hash as the workload demands? So if the max chain depth reaches a
> threshold, we increase the number of hash buckets (e.g. double it).
> 
> Of course the details of performing the grow (i.e. rehash) operation
> would need to be worked out so it doesn't affect performance,
> consistency, etc.. But from a theoretical stand point, moving it to be
> sized dynamically seems like a much better solution, IMO.
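To make the cost concrete, here is a minimal single-threaded sketch of the proposed grow-by-doubling rehash (a hypothetical design, not anything in ZFS; the load-factor trigger and all names are invented). Every existing entry has to be unlinked and relinked into the new table, and in the real ARC that walk would have to happen with all lock stripes held, which is exactly the stall described below.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical grow-by-doubling chained hash table (illustrative only). */
typedef struct entry {
	uint64_t	key;
	struct entry	*next;
} entry_t;

typedef struct {
	entry_t		**buckets;
	size_t		nbuckets;	/* power of two */
	size_t		count;
} htable_t;

static size_t
idx(const htable_t *ht, uint64_t key)
{
	return (key & (ht->nbuckets - 1));
}

static void
ht_grow(htable_t *ht)
{
	size_t newsz = ht->nbuckets * 2;
	entry_t **newb = calloc(newsz, sizeof (entry_t *));

	/*
	 * Every entry must be rehashed and moved. With millions of ARC
	 * headers, this walk is what pegs a CPU while the table is
	 * unusable; a concurrent version would hold all bucket locks.
	 */
	for (size_t i = 0; i < ht->nbuckets; i++) {
		entry_t *e = ht->buckets[i];
		while (e != NULL) {
			entry_t *next = e->next;
			size_t j = e->key & (newsz - 1);
			e->next = newb[j];
			newb[j] = e;
			e = next;
		}
	}
	free(ht->buckets);
	ht->buckets = newb;
	ht->nbuckets = newsz;
}

static void
ht_insert(htable_t *ht, entry_t *e)
{
	/* invented trigger: grow when the load factor exceeds 1 */
	if (++ht->count > ht->nbuckets)
		ht_grow(ht);
	size_t i = idx(ht, e->key);
	e->next = ht->buckets[i];
	ht->buckets[i] = e;
}

static entry_t *
ht_find(const htable_t *ht, uint64_t key)
{
	for (entry_t *e = ht->buckets[idx(ht, key)]; e != NULL; e = e->next)
		if (e->key == key)
			return (e);
	return (NULL);
}
```

Incremental schemes (rehashing a few buckets per operation, or keeping old and new tables live during migration) can spread that cost out, but they add lookup complexity and still have to answer the "when to trigger" question raised below.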

Several problems with this approach:

1) unpredictability - when do we trigger it, and by what criteria?
2) the extreme expense of rehashing everything (easily dozens of
   seconds of one CPU pegged at 100% while everything else grinds
   to a halt as the ARC is inaccessible)
3) hard to diagnose latency spikes due to problem #2

The easiest approach is just to give the admin a knob which they can
twiddle at boot and then leave alone. If performance is really a
problem, they can schedule downtime. Doing it at runtime has its
problems and doing it automatically is dangerous.

Cheers,
-- 
Saso
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer