I think it's clear that caching documents via ETag is close to pointless (the work to find the doc in the B+tree is over 90% of the total work and has to be done for both GET and HEAD).
Where there should be a boost is caching of attachments, since couch doesn't
have to fetch one byte of the actual binary in the case of an ETag match, and
caching of view query results, as couch knows the view on disk hasn't changed,
and so can return a 304 rather than recompute the result (which is where
stale=ok becomes your friend).

B.

On Mon, Nov 8, 2010 at 11:11 PM, Zachary Zolton <[email protected]> wrote:
> That makes sense: if every request to the caching proxy checks the
> etag against CouchDB via a HEAD request—and CouchDB currently does
> just as much work for a HEAD as it would for a GET—you're not going to
> see an improvement.
>
> On Mon, Nov 8, 2010 at 5:04 PM, Randall Leeds <[email protected]> wrote:
>> I should be more clear. If you have a custom caching policy whereby
>> the proxy will only check the ETag against the authority (Couch) once
>> per (hour, day, whatever) then you'll get a speedup. But if your proxy
>> performs a HEAD request for every incoming request you will not see
>> much performance gain.
>>
>> On Mon, Nov 8, 2010 at 12:06, Randall Leeds <[email protected]> wrote:
>>> As I mentioned on another thread, etags only save you bandwidth, as
>>> right now Couch performs the GET request and then discards the body.
>>> I'll open a JIRA ticket for this if it's not there already. It'd be
>>> nice if the "Couch is HTTP and can leverage existing caches and tools"
>>> talking point truly included significant gains from etag caching.
>>>
>>> On Mon, Nov 8, 2010 at 08:17, Zachary Zolton <[email protected]> wrote:
>>>> Drat! If only Varnish supported ETags...
>>>>
>>>> If you don't wanna use time-based expiry, you could probably craft a
>>>> custom-built solution where you watch the _changes feed and explicitly
>>>> purge URLs using a tool such as Thinner:
>>>>
>>>> http://propublica.github.com/thinner/
>>>>
>>>> Of course, you'd be stuck with manually tracking the types of URLs to
>>>> purge, so I haven't been too eager to try it out yet...
>>>>
>>>> —Zach
>>>>
>>>> On Sun, Nov 7, 2010 at 1:22 PM, Adam Kocoloski <[email protected]> wrote:
>>>>> Hi Karel, the last time I looked into this I came to the same
>>>>> conclusions as you have here. Regards,
>>>>>
>>>>> Adam
>>>>>
>>>>> On Nov 7, 2010, at 5:28 AM, Karel Minařík wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'd like to ask if anyone has some experience to share regarding
>>>>>> accelerating Couch with Varnish. I think lots of us are doing it,
>>>>>> but I can't find much info around.
>>>>>>
>>>>>> Originally, I thought it would be possible to use ETags with some
>>>>>> proper Varnish configuration (e.g. "accumulate" concurrent requests
>>>>>> and pass only one to the backend, etc.), but that seems not to be
>>>>>> possible, since Varnish does not pass ETags to the backend
>>>>>> [http://lists.varnish-cache.org/pipermail/varnish-misc/2010-November/004997.html].
>>>>>>
>>>>>> As I understand it now, the only way to cache Couch's responses
>>>>>> would be time-based caching: either using the cached response until
>>>>>> it auto-expires, or expiring the cached response via PURGE commands.
>>>>>>
>>>>>> Of course, it would be possible and technically trivial to send purge
>>>>>> requests via the _changes feed or via the "update_notification"
>>>>>> mechanism. As I see it, the tricky part would be knowing which
>>>>>> objects to purge, based on individual document changes, because not
>>>>>> only single documents but also aggregated view results or fulltext
>>>>>> queries would get cached.
>>>>>> Of course, "there are two hard things in computer science...".
>>>>>>
>>>>>> Has anyone put any thoughts/work into this?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Karel

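For the purge-on-change approach Zach and Karel discuss, a minimal sketch (again, not from the thread) could look like the following. It assumes a database named `db`, Varnish listening on localhost:6081 with a `vcl_recv` rule that allows PURGE from this host, and document URLs cached under `/db/<docid>`; all of these are assumptions. It deliberately ignores the genuinely hard part Karel points out: mapping a changed document to the view and fulltext-query URLs that depend on it.

    # Illustrative sketch only: database name, Varnish port, and URL layout
    # are assumptions; Varnish must be configured to accept PURGE requests.
    import requests

    COUCH = "http://localhost:5984"
    VARNISH = "http://localhost:6081"
    DB = "db"

    def follow_changes(since=0):
        """Long-poll the _changes feed and yield changed document ids."""
        while True:
            resp = requests.get(
                "%s/%s/_changes" % (COUCH, DB),
                params={"feed": "longpoll", "since": since, "timeout": 60000},
            )
            resp.raise_for_status()
            data = resp.json()
            since = data["last_seq"]
            for change in data["results"]:
                yield change["id"]

    def purge(doc_id):
        # Ask Varnish to drop the cached copy of this document URL.
        # Mapping a change to the affected *view* URLs is the tricky part
        # the thread talks about; this only handles per-document URLs.
        url = "%s/%s/%s" % (VARNISH, DB, doc_id)
        requests.request("PURGE", url)

    if __name__ == "__main__":
        for doc_id in follow_changes():
            purge(doc_id)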