On Tue, Apr 26, 2011 at 5:25 PM, Mark Moseley <[email protected]> wrote:
> I was working on something in my quest to keep big (eventually
> uncacheable) objects from wreaking havoc on my cache. Even if I employ
> a scheme to call "restart" from vcl_fetch, after adding a header that
> tells vcl_recv to call 'pipe', the object still gets fetched from the
> origin server. And if it's 1.5 gig, it can be pretty painful.
>
> So I was hoping to run this by you guys, esp. the Varnish devs.
> Mainly I wanted to hear if anyone thought this was a tremendously bad
> idea. I wrote this about 45 minutes ago, so it's not particularly
> well tested, but if you guys said this was the worst idea ever, then
> I might reconsider putting a lot more time into perfecting it. There
> are likely to be big corner cases here. There was another recent
> thread about this subject, so I know some other people are looking
> for a similar solution, and I thought I'd throw this out there too.
> This doesn't protect me from 1.5 gig JPEG files, but it does most of
> the job. And yes, I'm OK with all the extra backend reqs, provided
> they're HEADs.
>
> Mainly what it's doing is this:
>
> 1. Huge files won't ever be HITs in my environment, since I'll have
>    piped them.
> 2. On a MISS (as it should be), rewrite the backend method from GET (I
>    don't do POSTs on varnish) to HEAD in vcl_miss if it's a file
>    extension likely to be a biggish file and matches other conditions.
> 3. In vcl_fetch, if it's a rewritten HEAD, do a size check. If it's
>    too big, add the header that tells vcl_recv to drop immediately to
>    'pipe'.
> 4. In either case, in vcl_fetch, rewrite the method back to GET and
>    call 'restart'.
>
> Here's the essence of the VCL (imagine regularly-working VCL alongside
> it). I typed this out, so ignore dumb typos:
>
> sub vcl_recv {
>     ....
>     # If we've got the header that says to pipe this request, pipe it
>     # (thanks Tollef)
>     if ( req.http.X-PIPEME && req.restarts > 0 ) {
>         return( pipe );
>     }
>     ....
> }
>
> # The extensions in this regex are some sample ones that are often
> # huge in size; the eventual list would be bigger and have others
> # like 'mpg' etc. Note that I don't send POSTs over varnish, so
> # ignore the lack of POST handling.
> sub vcl_miss {
>     # If no headcheck header and GET and type is on big list,
>     # rewrite to HEAD
>     if ( ! req.http.X-HEADCHECK && bereq.request == "GET" &&
>          req.url ~ "\.(gz|wmv|zip|flv|avi)$" && req.restarts == 0 ) {
>         set req.http.X-HEADCHECK = "1";
>         set bereq.request = "HEAD";
>         set bereq.http.User-Agent = "HEAD Check";
>         log "DEBUG: Rewriting to HEAD";
>     }
> }
>
> sub vcl_fetch {
>     # If this used to be a GET request that we changed to HEAD, do
>     # length check. But try to avoid restart loops.
>     if ( req.http.X-HEADCHECK && req.request == "GET" &&
>          bereq.request == "HEAD" &&
>          req.url ~ "\.(gz|wmv|zip|flv|avi)$" && req.restarts < 1 ) {
>         unset req.http.X-HEADCHECK;
>         set bereq.request = "GET";
>         log "DEBUG: [fetch] Rewriting back to GET";
>
>         # If content is over 10 meg, pipe it
>         if ( beresp.http.Content-Length ~ "[0-9]{8,}" ) {
>             set req.http.X-PIPEME = "1";
>         }
>
>         restart;
>     }
>     ....
> }
>
> Mainly I'm just looking for whether the Varnish devs think that this
> would cause something to completely explode and/or melt down, or that
> this is the worst security hole ever. It seems to work OK so far. For
> reqs that match 'beresp.http.Content-Length ~ "[0-9]{8,}"', the "SMA
> bytes allocated" counter never budges, where it normally does for
> anything fetched (memory backend).
>
> Thanks! Hope someone else can benefit from this too. If someone else
> uses this (after thorough testing), be sure to remove the 'log' calls
> in production.
>
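
A note on the size check quoted above: "[0-9]{8,}" matches any Content-Length with eight or more digits, i.e. 10,000,000 bytes and up (roughly 9.5 MiB), and the cut-off can only move in powers of ten. Below is a rough, untested sketch of a finer-grained variant. The 50,000,000-byte threshold and the anchored regex are assumptions of mine for illustration, not something from the original VCL, and it assumes the backend sends Content-Length without leading zeros. It would replace the inner size check in the HEAD-check block of vcl_fetch:

```vcl
sub vcl_fetch {
    # Assumed threshold (not from the original post): pipe anything of
    # 50,000,000 bytes or more. Either exactly 8 digits starting with
    # 5-9, or 9+ digits. Anchored so only a whole numeric value matches.
    if ( beresp.http.Content-Length ~ "^([5-9][0-9]{7}|[0-9]{9,})$" ) {
        set req.http.X-PIPEME = "1";
    }
}
```

A response with no Content-Length header at all (e.g. chunked) matches neither regex, so it would still be fetched and cached as usual.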
Just to update: this works great so far. Prior to this, I was hitting that stevedore.c error on lots of my boxes after a few days of uptime (thanks to customers with gigantic files). I rolled this solution out across the board about 2 weeks ago, and most of my boxes' varnishd processes have stayed up since that deploy. If you try it yourself, watch for restart loops.
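
For anyone worried about loops, here is a minimal, untested sketch of a guard that could sit at the top of vcl_recv. It assumes nothing else in your VCL calls restart (the HEAD-check scheme itself only ever restarts once), and the 503 message text is just a placeholder I made up. It also drops client-supplied copies of the two marker headers on the first pass, so a client can't skip the HEAD check by sending X-HEADCHECK itself:

```vcl
sub vcl_recv {
    if ( req.restarts == 0 ) {
        # First pass: the marker headers should only ever be set by
        # our own VCL, so discard anything the client sent.
        unset req.http.X-HEADCHECK;
        unset req.http.X-PIPEME;
    } elsif ( req.restarts > 1 ) {
        # The scheme restarts at most once; more than that means a
        # loop. Assumption: no other VCL logic relies on restarts.
        error 503 "Restart loop detected";
    }
}
```

Varnish's own max_restarts parameter still acts as a backstop; this just fails earlier and with a message that's easier to spot in the logs.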
