On Mon, 15 Aug 2011 23:17:55 +0200, Henrik Nordström wrote:
On Mon, 2011-08-15 at 09:50 -0600, Alex Rousskov wrote:

I do not like aborted retrievals as the default method of handling a
digest-based hit. Aborted transactions have negative side-effects and
some of those effects are not controlled by Squid (e.g., monitoring
software may trigger an alert if too many requests are aborted).

I agree that we can switch from entities to instances, provided we are
OK with excluding 206, 302, and similar non-200 responses from the
optimization. By instance definition, Squid would not be able to compute
or use an instance digest if the response is not 200 OK. We can hope
that the vast majority of non-200 responses are either not cacheable or
are very small and not worth optimizing.

The bulk bandwidth where you would find duplicates is in positive GET
responses.

Not being able to support 206 duplicate detection without caching the
full 200 in the "topmost" cache is a little annoying however.
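
To make the 200-only restriction concrete, here is a rough sketch (Python, not Squid code; the function name and the choice of SHA-256 are just illustrative) of computing an instance digest only for a complete 200 OK response:

    import base64
    import hashlib

    def instance_digest(status_code, body):
        """Return a digest value for the complete instance, or None.

        Only a 200 OK carries the complete instance, so 206 (partial
        content), 302 (redirect) and other non-200 responses are
        skipped: their bodies are not the instance the URL names.
        """
        if status_code != 200:
            return None
        checksum = hashlib.sha256(body).digest()
        return "SHA-256=" + base64.b64encode(checksum).decode("ascii")

    print(instance_digest(200, b"full entity body"))  # SHA-256=...
    print(instance_digest(206, b"partial body"))      # None
    print(instance_digest(302, b""))                  # None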

> In requests you can optionally add a digest-based condition similar to
> If-None-Match, but here If-None-Match already serves the purpose quite
> well, so use of the digest condition should probably be limited to cases
> where there is no ETag.

Or to cases where ETag lies about response content changes.

True, but I kind of doubt there is much bandwidth to be found in those
cases.
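
As an aside, a toy sketch of the rule quoted above: prefer If-None-Match when an ETag exists and fall back to the digest condition only when there is none. The If-None-Digest header name is invented here purely for illustration, it is not a real HTTP header:

    def revalidation_headers(cached_etag, cached_digest):
        """Prefer If-None-Match when an ETag exists; fall back to a
        digest-based condition only when the origin gave no ETag."""
        headers = {}
        if cached_etag:
            headers["If-None-Match"] = cached_etag
        elif cached_digest:
            # Hypothetical header name, purely for illustration.
            headers["If-None-Digest"] = cached_digest
        return headers

    print(revalidation_headers('"v1.2"', "SHA-256=abc..."))
    # {'If-None-Match': '"v1.2"'}
    print(revalidation_headers(None, "SHA-256=abc..."))
    # {'If-None-Digest': 'SHA-256=abc...'}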

> To optimize bandwidth loss due to unneeded transmission a slow start
> mechanism can be used where the sending part waits a couple RTTs before
> starting to transmit the body of a large response where an instance
> digest is presented. This allows the receiving end to check the received
> instance digest and abort the request if not interested in receiving the
> body.
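
For what it's worth, a minimal sketch of the sender side of that idea, assuming roughly 500 ms per RTT and some out-of-band abort signal; none of this is an existing Squid or HTTP mechanism:

    import asyncio

    HOLD_OFF = 2 * 0.5  # "a couple RTTs", assuming ~500 ms per RTT

    async def send_with_digest(writer, headers_with_digest, body, aborted):
        """Send the headers (carrying the instance digest) right away,
        then hold the body back so the receiver has time to abort a
        known duplicate. 'aborted' is an asyncio.Event set when the
        peer cancels the request."""
        writer.write(headers_with_digest)
        await writer.drain()
        try:
            # If the peer aborts within the hold-off, skip the body.
            await asyncio.wait_for(aborted.wait(), timeout=HOLD_OFF)
            return  # receiver already has the object; body never sent
        except asyncio.TimeoutError:
            pass    # no abort arrived; transmit as normal
        writer.write(body)
        await writer.drain()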

Besides my general dislike for aborted transactions becoming a norm (see above), "a couple RTT" delay is a high price to pay because each RTT is
a few seconds already.

Seconds? What kind of network is this?

Satellite (long distance), submarine radio (long wave, low bitrate), or ad-hoc ground relay (multiple long distance IP hops).

The RTT details on the latter two are mostly classified. But GEO-sync satellites are publicly documented: a single ground-satellite-ground loop can have close to 1 second of RTT at the IP level. With complications such as triangular routing over a ground-ground uplink, that only gets worse.
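
Rough numbers behind that figure (my arithmetic, not from the thread): GEO altitude is about 35,786 km, so light-speed propagation alone puts a floor of roughly half a second on a ground-satellite-ground round trip, before any processing, queuing, or triangular routing is added:

    GEO_ALTITUDE_KM = 35_786   # geostationary orbit height above the equator
    LIGHT_SPEED_KM_S = 299_792

    one_leg = GEO_ALTITUDE_KM / LIGHT_SPEED_KM_S   # ground -> satellite, best case
    rtt = 4 * one_leg                              # up, down, then up, down again
    print(f"propagation-only RTT: {rtt:.2f} s")    # ~0.48 s; IP-level RTT is higher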

Also, satellites with routers aboard were due to go up sometime over the last year, so there may now be ground-satellite-satellite-ground loops as well. I'm not sure what the real numbers are there, but in the early days figures like 2-3 seconds of RTT were being discussed, mostly due to low-power requirements, send/receive context switching (!!), or buffer bloat from queuing to cope with the bitrates. So it is reasonable to expect that at least some of those links have really poor performance.

Amos
