The other bottleneck I looked at was MD5 as the on-disk naming scheme. I think MD5 is a poor choice here because it's not very fast. Switching to a variant of the times-33 hash might work out better. *shrug*
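For reference, a rough sketch of what a times-33 variant could look like for naming cache files. This is only an illustration, not anything from the existing code: the names times33_hash and cache_name_from_url are made up, the 5381 seed is the conventional Bernstein starting value (assumed here), and a 32-bit result is far narrower than MD5, so collisions would be much more likely.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* times-33 string hash: h = h * 33 + c for every byte of the key.
 * The seed 5381 is an assumption (the usual Bernstein value). */
static uint32_t times33_hash(const char *key)
{
    uint32_t h = 5381;
    const unsigned char *p;

    assert(key != NULL);
    for (p = (const unsigned char *)key; *p != '\0'; p++) {
        h = h * 33u + *p;   /* wraps modulo 2^32, which is intended */
    }
    return h;
}

/* Hypothetical helper: write the hash of the URL as 8 hex digits into buf.
 * Returns 0 on success, -1 if buf is too small (needs at least 9 bytes). */
static int cache_name_from_url(const char *url, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%08" PRIx32, times33_hash(url));
    return (n == 8 && (size_t)n < len) ? 0 : -1;
}
```

With a hash this short, the cache entry would have to store the full URL as well and compare it on lookup, since two URLs can easily map to the same name.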
How to handle collisions?
MD5 can collide, too. What do squid or other proxies do? Part of me thinks using MD5 at all is sort of silly - just use the URL itself. *shrug* -- justin
