On Fri, Jan 02, 2004 at 03:44:02AM +0000, Colin Watson wrote:
> On Thu, Jan 01, 2004 at 06:52:53PM -0800, Matt Zimmerman wrote:
> > A cache which serves stale data is a broken cache. I think that apt
> > is within its rights to expect a consistent view from the world. You
> > would see other failures if you got mismatched versions of Release and
> > Release.gpg.
>
> Without perfect expiration data from the server, HTTP caches can't
> fulfil this criterion, otherwise they always need to contact the server
> and therefore can't properly fulfil their purpose as caches. What if
> somebody had manually fetched Packages.gz ten minutes before the mirror
> sync, but Release was uncached so the cache had to fetch it for the apt
> run after the mirror sync?

Their only purpose as caches is to improve performance; there's no rule that
they aren't allowed to contact the server to ensure that their response is
fresh enough.

> In fact, from the cache's point of view the files are probably *not*
> stale. A quick check on ftp.uk.debian.org, for instance, confirms that we
> don't send any Expires or Cache-Control headers (even if we did, they
> couldn't be 100% accurate). Therefore the cache's only possible freshness
> criteria are based on heuristic expiration guesses, and those are unlikely
> to be tight enough to avoid occasional failures.

Since Release and Packages/Sources are generated at close to the same time,
they should always be about the same age, and the heuristics should apply to
them equally. Inconsistent Packages/Sources/Release really ought to be a
rather infrequent case.

> A request with If-Modified-Since returns 304 Not Modified (and therefore
> no message-body) if the entity has not been modified, so you can only use
> this for files you already have. The caching problems above may well be
> caused by requests from multiple systems, so Cache-Control would be a more

APT almost always has old copies of the indexes that it needs, so IMS is very
efficient. You are correct that it cannot guarantee consistency, but I
believe that it would have prevented the problem in this case.

> Of course, Cache-Control requires you to make at least one uncontrolled
> request first (or perhaps max-age=0 or no-cache?) in order that you know
> the server's date, otherwise you run into clock synchronization
> problems. It *is* possible to implement "get me file B and make sure
> it's at least as new as file A" as an HTTP client, though, and if you
> make sure your first file is genuinely fresh then the problem is solved
> unless you're actually in a mirror sync at the time, in which case you
> can just try again later.

Hmm, APT's http method already sends Cache-Control: max-age for Packages and
Sources, just not for Release. I can fix that.

-- 
 - mdz
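
For illustration only, here is a rough sketch of the two request-side
mechanisms discussed above: an If-Modified-Since header derived from the
local copy's mtime, plus a Cache-Control: max-age request header so that
intermediate caches revalidate. This is Python, not APT's actual http method;
the function names, the URL path and the max-age value are all assumptions
made up for the example, and nothing is sent over the network.

```python
from email.utils import formatdate, parsedate_to_datetime
from urllib.request import Request


def conditional_request(url, local_mtime=None, max_age=0):
    """Build (but do not send) a GET that asks caches to revalidate.

    local_mtime is the Unix timestamp of the copy we already have, if any.
    max_age=0 is an assumed value, not what APT necessarily uses.
    """
    headers = {"Cache-Control": f"max-age={max_age}"}
    if local_mtime is not None:
        # HTTP-date in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT"
        headers["If-Modified-Since"] = formatdate(local_mtime, usegmt=True)
    return Request(url, headers=headers)


def at_least_as_new(last_modified_b, last_modified_a):
    """True if file B's Last-Modified is not older than file A's.

    Both arguments are Last-Modified header values taken from responses.
    """
    return parsedate_to_datetime(last_modified_b) >= parsedate_to_datetime(
        last_modified_a
    )


# Hypothetical path on the mirror named in the thread; never fetched here.
req = conditional_request(
    "http://ftp.uk.debian.org/debian/dists/unstable/Release", local_mtime=0
)
```

A client could compare the Last-Modified values of Release and Packages with
at_least_as_new() and retry later on a mismatch, which is roughly the "file B
at least as new as file A" check described in the quoted text.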

