Graham Leggett wrote:
> Brian Akins wrote:
>
>> Can someone please summarize the various patches for mod_disk_cache
>> that have been floating around in the last couple of weeks? I have
>> looked at the patches but wasn't really sure of the general
>> philosophy/methodology behind them.
>
> In essence, the patches solve the thundering herd problem.
>
> This involved some significant changes to the way the disk cache
> worked, some of which are still being reviewed.
>
> What's been committed so far is a temporary workaround to the "4.7GB
> file buckets are being loaded into RAM" problem. The workaround detects
> whether the bucket being read is a file bucket, and copies the file
> into the cache using file read and file write, instead of a bucket
> read (and thus a crash).
>
> Joe Orton raised a concern about the targeting of file buckets for
> special treatment in the disk cache, and I've been trying to find the
> "right" way to handle this. Looking at the network write filter, it
> also seemed to follow the logic "if file bucket, special handling
> (possibly SENDFILE), else read then write". So far the way that seems
> most elegant to me is to create a file write filter based on the
> network write filter, which hides all the special bucket handling
> magic. Still working on it, though.
Have you seen my patch to address this issue? IMHO, it is far less
complex and less expensive than the committed workaround.

> Next up was the fix for thundering herd itself - this involved
> teaching the read-then-write loop inside the disk cache to be broken
> into two possible paths: read from the output filter stack and write
> to the cache file, followed by read from the cache file and write to
> the network.
>
> ..
>
> The next patch, which was posted by Niklas but not committed yet,
> solves the problem of the lead request reading the entire response,
> then sending it to the client, as mentioned above. It involves
> disk_cache forking a process/thread as appropriate, splitting the
> read response / write to cache from the read from cache / write to
> network, which now run independently of each other.
>
> I am trying to find ways of removing the need for the extra fork /
> thread. One option is to perform non-blocking writes in the read from
> cache / write to network code, only iterating to the next write once
> the previous write is complete (i.e. no block).
>
> This means the write to the client will always be the same speed or
> slower than the read from the backend, which is exactly what we want.

I didn't like that sleep/fstat hackery. How about using a real file
notification scheme?

--
Davi Arnaut