Hi all, I've been braindumping my thoughts into the Lusca blog during some experimental development to eliminate the data copy in the disk store read path. That copy shows up as the number one CPU abuser in my test CDN deployment, where I see a 99% hit rate on a set of large (> 16 MB) objects.
My first idea was to avoid papering over the storage code's shortcomings with refcounted buffers, and instead modify various bits of code to keep the store-supplied read buffer around until the read IO completes. This mirrors the requirements of various other underlying async IO implementations, such as POSIX AIO and Windows completion IO.

Unfortunately, the store layer and the async IO code don't handle event cancellation right (i.e., you can't do it at all); the temporary read buffer in async_io.c plus the callback data pointer check paper over that. Store reads and writes may be scheduled and in flight when some other part of the code calls storeClose(), and nothing really tries to wait around for the read IO to complete. So either the store layer needs to be made slightly more sane (which I may attempt later), or the whole mess can stay a mess and be papered over by abusing refcounted buffers all the way down to the IO layer.

Anyway, I know there are other developers out there working on filesystem code for Squid-3, and I'm reasonably certain (read: at last check a few months ago) that its store and IO layers are just as grimy - so hopefully my braindumping will save some of you a whole lot of headache. :)

Adrian