Hi,

I'm ashamed to reply so late; sorry, I got lost in other stuff on Monday. I'll combine my replies:

On Mon, Nov 8, 2010 at 08:17, Zachary Zolton <[email protected]> wrote:
Of course, you'd be stuck with manually tracking the types of URLs to
be purged, so I haven't been too eager to try it out yet...

Yes, that's precisely what I'd like to avoid. It's not _that_ hard, of course, and Couch provides an awesome entry point for invalidation in _changes or update_notifier, but still...
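
Just to show what I mean by "not that hard": a rough sketch (certainly not production code) of the _changes-as-invalidation-entry-point idea -- follow the continuous _changes feed and send an HTTP PURGE to the cache for every changed document. The hosts, ports and the "mydb" database are made-up examples, and the proxy is assumed to be configured to accept PURGE requests.

import json
from urllib.request import Request, urlopen
from urllib.error import HTTPError

COUCH = "http://localhost:5984/mydb"   # hypothetical CouchDB database
CACHE = "http://localhost:6081"        # hypothetical caching proxy

feed = urlopen(COUCH + "/_changes?feed=continuous&heartbeat=30000")
for raw in feed:
    line = raw.strip()
    if not line:
        continue  # heartbeat keep-alive, nothing to do
    change = json.loads(line)
    if "id" not in change:
        continue  # e.g. the trailing "last_seq" line
    # Invalidate the cached copy of this document's URL.
    purge = Request("%s/mydb/%s" % (CACHE, change["id"]), method="PURGE")
    try:
        urlopen(purge)
    except HTTPError:
        pass  # 404/405 etc. -- URL was never cached or PURGE refused

The annoying part is exactly what Zachary says above: the script has to know which URLs (documents, views, show/list functions, ...) correspond to which changes, and that bookkeeping is manual.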

On 9.Nov, 2010, at 24:42 , Robert Newson wrote:
I think it's clear that caching via ETag for documents is close to
pointless (the work to find the doc in the b+tree is over 90% of the
work and has to be done for GET or HEAD).

Yes. I wonder if there's any room for improvement on Couch's part. In any case, when we're in the "every 1 request to the cache means 1 request to the database" situation, "caching" is truly pointless.

On Mon, Nov 8, 2010 at 11:11 PM, Zachary Zolton <[email protected]> wrote:
That makes sense: if every request to the caching proxy checks the
etag against CouchDB via a HEAD request—and CouchDB currently does
just as much work for a HEAD as it would for a GET—you're not going to
see an improvement.

Yes. But that's not the only scenario imaginable. I'd repeat what I wrote to the Varnish mailing list [http://lists.varnish-cache.org/pipermail/varnish-misc/2010-November/004993.html]:

1. The cache can "accumulate" requests to a certain resource for a certain (configurable?) period of time (1 second, 1 minute, ...) and ask the backend less often -- accelerating throughput.

2. The cache can return "possibly stale" content immediately and check with the backend afterwards (in the background, when the n-th next request comes, ...) -- accelerating response time.

It was my impression that at least the first option is doable with Varnish (via some playing with the grace period), but I may be severely mistaken.
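
To make the first scenario concrete, here is a toy sketch (nothing more -- the URL and interval are made up, and a real proxy like Varnish or Squid would do this with proper locking and per-URL state): no matter how many requests arrive, the backend is asked at most once per interval, and everyone else gets the last known copy.

import time
from urllib.request import urlopen

BACKEND_URL = "http://localhost:5984/mydb/some_doc"  # hypothetical
REFRESH_INTERVAL = 1.0  # seconds between backend hits

_cached_body = None
_fetched_at = 0.0

def get(url=BACKEND_URL):
    global _cached_body, _fetched_at
    now = time.monotonic()
    if _cached_body is None or now - _fetched_at >= REFRESH_INTERVAL:
        _cached_body = urlopen(url).read()  # one backend round trip
        _fetched_at = now
    return _cached_body                     # everyone else: cached copy

With a thousand clients per second calling get(), Couch would see roughly one GET per second instead of a thousand -- exactly the kind of accumulation I'm after.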

On Mon, Nov 8, 2010 at 5:04 PM, Randall Leeds <[email protected]> wrote:
If you have a custom caching policy whereby
the proxy will only check the ETag against the authority (Couch) once per (hour, day, whatever) then you'll get a speedup. But if your proxy
performs a HEAD request for every incoming request you will not see
much performance gain.

P-r-e-c-i-s-e-l-y. If we could tune Varnish or Squid to not be so "dumb" and to check with the backend based on some config like this, we could use it for proper self-invalidating caching. (As opposed to TTL-based caching, which brings back the manual expiration issues discussed above.) Unfortunately, at least based on the answers I got, this just doesn't seem to be possible.
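
Expressed as a sketch, the policy Randall describes would look something like this (a single-entry toy cache; the period and names are illustrative only): keep the body together with its ETag, and only revalidate with a conditional GET when the entry is older than the configured period -- a 304 still costs Couch a lookup, but only once per period instead of once per request.

import time
from urllib.request import Request, urlopen
from urllib.error import HTTPError

REVALIDATE_AFTER = 3600.0  # seconds; "once per hour"

_entry = {"body": None, "etag": None, "checked_at": 0.0}

def get(url):
    now = time.monotonic()
    if _entry["body"] is not None and now - _entry["checked_at"] < REVALIDATE_AFTER:
        return _entry["body"]                      # fresh enough, no backend hit
    headers = {}
    if _entry["etag"]:
        headers["If-None-Match"] = _entry["etag"]  # conditional GET
    try:
        resp = urlopen(Request(url, headers=headers))
        _entry["body"] = resp.read()
        _entry["etag"] = resp.headers.get("ETag")
    except HTTPError as err:
        if err.code != 304:
            raise
        # 304 Not Modified: keep the cached body, just reset the clock.
    _entry["checked_at"] = now
    return _entry["body"]

That's the behaviour I'd love to get out of an off-the-shelf proxy via configuration alone, rather than writing it myself.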

On Mon, Nov 8, 2010 at 12:06, Randall Leeds <[email protected]> wrote:
It'd be nice if the "Couch is HTTP and can leverage existing caches and tools"
talking point truly included significant gains from etag caching.

P-R-E-C-I-S-E-L-Y. This is, for me, the most important and most embarrassing issue in this discussion. The O'Reilly book has it all over the place: http://www.google.com/search?q=varnish+OR+squid+site:http://guide.couchdb.org . Whenever you tell someone who really knows HTTP caches "Dude, Couch is HTTP and can leverage existing caches and tools", you can and will be laughed at -- you can get away with mentioning expiration-based caching and "simple" invalidation via _changes and such, but... Still embarrassing.

I'll try to do more research in this area when time permits. I can't believe there's _not_ some arcane Varnish config option to squeeze out more performance, e.g. in the "highly concurrent requests" scenario.

Thanks for all the replies!

Karel
