Hi,
I'm ashamed to reply so late -- sorry, I got lost in other stuff on
Monday. I'll combine my replies:
On Mon, Nov 8, 2010 at 08:17, Zachary Zolton <[email protected]> wrote:
> Of course, you'd be stuck with manually tracking the types of URLs to
> be purged, so I haven't been too eager to try it out yet...
Yes, that's precisely what I'd like to avoid. It's not _that_ hard, of
course, and Couch provides an awesome entry point for the invalidation
in _changes or an update_notifier, but still...
On 9. Nov 2010, at 24:42, Robert Newson wrote:
> I think it's clear that caching via ETag for documents is close to
> pointless (the work to find the doc in the b+tree is over 90% of the
> work and has to be done for GET or HEAD).
Yes. I wonder if there's any room for improvement on Couch's part. In
any case, when we're in an "every request to the cache means a request
to the database" situation, "caching" is truly pointless.
On Mon, Nov 8, 2010 at 11:11 PM, Zachary Zolton <[email protected]> wrote:
> That makes sense: if every request to the caching proxy checks the
> etag against CouchDB via a HEAD request -- and CouchDB currently does
> just as much work for a HEAD as it would for a GET -- you're not
> going to see an improvement.
Yes. But that's not the only scenario imaginable. I'll repeat what I
wrote to the Varnish mailing list
[http://lists.varnish-cache.org/pipermail/varnish-misc/2010-November/004993.html]:
1. The cache can "accumulate" requests to a certain resource for a
certain (configurable?) period of time (1 second, 1 minute, ...) and
ask the backend less often -- accelerating throughput.
2. The cache can return "possibly stale" content immediately and check
with the backend afterwards (in the background, when the n-th next
request comes, ...) -- accelerating response time.
It was my impression that at least the first option is doable with
Varnish (via some playing with the grace period), but I may be
severely mistaken.
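For the record, what I have in mind is roughly the textbook grace
setup -- a sketch only, in Varnish 2.1 VCL, with made-up numbers, and I
haven't verified it behaves exactly like this:

    sub vcl_recv {
        # Let requests be answered with objects up to 30s past their
        # TTL instead of piling up behind the one request that is
        # refreshing the object from CouchDB.
        set req.grace = 30s;
    }

    sub vcl_fetch {
        # Keep expired objects around long enough for the above to apply.
        set beresp.grace = 1h;
    }
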
On Mon, Nov 8, 2010 at 5:04 PM, Randall Leeds <[email protected]> wrote:
> If you have a custom caching policy whereby the proxy will only check
> the ETag against the authority (Couch) once per (hour, day, whatever)
> then you'll get a speedup. But if your proxy performs a HEAD request
> for every incoming request you will not see much performance gain.
P-r-e-c-i-s-e-l-y. If we could tune Varnish or Squid not to be so
"dumb" and to check with the backend based on some such configuration,
we could use it for proper self-invalidating caching. (As opposed to
TTL-based caching, which brings the manual expiration issues discussed
above.) Unfortunately, at least based on the answers I got, this just
does not seem to be possible.
On Mon, Nov 8, 2010 at 12:06, Randall Leeds <[email protected]> wrote:
> It'd be nice if the "Couch is HTTP and can leverage existing caches
> and tools" talking point truly included significant gains from etag
> caching.
P-R-E-C-I-S-E-L-Y. This is, for me, the most important and most
embarrassing issue in this discussion. The O'Reilly book has it all
over the place:
http://www.google.com/search?q=varnish+OR+squid+site:http://guide.couchdb.org
Whenever you tell someone who really knows HTTP caches "Dude, Couch is
HTTP and can leverage existing caches and tools", you can and will be
laughed at -- you can get away with mentioning expiration-based caching
and "simple" invalidation via _changes and such, but... Still
embarrassing.
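To be concrete about the _changes-based invalidation, I mean something
along these lines -- an untested sketch, with made-up hosts and database
name, using the Python requests library, and assuming the proxy is
already configured to accept PURGE requests:

    # Follow the continuous _changes feed and PURGE the corresponding
    # document URL from the cache whenever a document is updated.
    import json
    import requests

    COUCH = "http://localhost:5984"
    PROXY = "http://localhost:6081"
    DB = "mydb"

    feed = requests.get("%s/%s/_changes" % (COUCH, DB),
                        params={"feed": "continuous", "heartbeat": 30000},
                        stream=True)
    for line in feed.iter_lines():
        if not line:
            continue                      # heartbeat
        change = json.loads(line)
        if "id" not in change:
            continue                      # trailing {"last_seq": ...} row
        # Invalidate the cached copy of the document that just changed.
        requests.request("PURGE", "%s/%s/%s" % (PROXY, DB, change["id"]))
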
I'll try to do more research in this area when time permits. I don't
believe there _isn't_ some arcane Varnish config option to squeeze out
some performance, e.g. in the "highly concurrent requests" scenario.
Thanks for all the replies!
Karel