Daniel Bromberg <[EMAIL PROTECTED]> writes:
> I remember reading a paper about this. I wish I could remember which
> one. Basically, I think the code path to look up this information is
> very long and unoptimized (no hints).
It was probably Stolarchuk's old paper, cited earlier in this thread.
AFS has changed a lot since then (1991?). Most significantly, the cache
is now integrated with the VM subsystem. Someone suggested that the
difference between observed performance and the paper's claims must be
a statistical anomaly. More likely, it's evidence of performance
improvements made after the paper's publication (and not entirely by
coincidence...)
> It beats the heck out of me why AFS can't have O(1) cache lookups.
> Hashing based on unique identifiers (AFS's fid's & chunk offset would
> make a nice one) is something from my undergraduate years. Someone
> made a comment about hardware vs. software, and AFS was limited by
> being all in software. This doesn't really matter when it comes down
> to O(f(n)) calculations. Hardware only improves by a constant
> factor. It's the *algorithm* that matters.
In fact, it does hash on the FID+chunk. Thus, cache lookups don't get
slower with larger caches. The observed degradation with excessively
large caches (what counts as "excessive" depends on the application) is
probably related to cache reclamation, which is presently linear with
respect to the cache size. As Joop suggested, tuning for optimum
performance is unfortunately not quite as simple as "make your cache as
big as you can."
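To illustrate the distinction between the two costs, here is a rough
sketch in Python. It is not the actual cache manager code; the names
(Fid, ChunkCache, reclaim) and the least-recently-used victim choice are
assumptions made up for illustration. The point is only that a lookup
keyed on FID+chunk costs the same at any cache size, while a reclamation
pass that scans every entry grows with the cache.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fid:
    # Hypothetical stand-in for an AFS file identifier.
    volume: int
    vnode: int
    uniquifier: int


class ChunkCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}   # (fid, chunk) -> [data, last_used]
        self.clock = 0      # logical time, bumped on every access
        self.scanned = 0    # entries examined during reclamation

    def _tick(self):
        self.clock += 1
        return self.clock

    def lookup(self, fid, chunk):
        # Hash lookup on (fid, chunk): cost does not depend on cache size.
        entry = self.entries.get((fid, chunk))
        if entry is None:
            return None
        entry[1] = self._tick()
        return entry[0]

    def insert(self, fid, chunk, data):
        key = (fid, chunk)
        if key not in self.entries and len(self.entries) >= self.capacity:
            self.reclaim()
        self.entries[key] = [data, self._tick()]

    def reclaim(self):
        # Assumed policy: evict the least recently used entry, found by
        # scanning every entry. This is the part that is linear in the
        # number of cached chunks.
        victim = None
        oldest = None
        for key, (_, last_used) in self.entries.items():
            self.scanned += 1
            if oldest is None or last_used < oldest:
                victim, oldest = key, last_used
        del self.entries[victim]
```

With a scan like this, every eviction touches all entries, so a cache
ten times larger pays roughly ten times as much per reclamation even
though each lookup costs the same.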
At one point, Mickey had good reason to use a 2GB cache, because his
applications demanded it. That would be outrageously large for most
uses.