On Thu, 2004-10-21 at 13:04 +1000, Brian May wrote:
> * No thought put into the file deletion algorithm. IMHO, deleting
> files based on age is wrong (consider how long stable files
> last). Deleting files based on number of different copies is also
> wrong (consider if you have some systems setup with stable and another
> is unstable). IMHO, the only correct way is to scan the most recently
> downloaded Packages and Source index files and delete files that
> aren't mentioned anymore.

That's how apt-cacher does it. Early versions of apt-cacher did no cache
cleaning and it was the #1 requested feature for a while, but once I sat
down to actually start implementing it I discovered something that's not
obvious until you try to do it yourself: Writing Cache Expiry Algorithms
Is Bloody Hard(TM).

In the end I settled on a combination: Packages and Release files are
expired based on age, and .debs are purged when no Packages file in the
cache references them any more.

However, that's not a 100% solution either, because what happens if
several days go by without any clients doing an 'apt-get update'? The
Packages file is purged by the cache cleaning script because it's too
old, but then all the .debs are purged too because there's no matching
Packages file! Doh. So it's necessary to keep fetching the Packages
files within their expiry time or the cache gets nuked.

> If you want a reliable caching service, I think some thought needs to
> be put into some of the issues above. Some issues might be easy to
> fix, others might be harder (e.g. minimizing latency so the client
> doesn't time out and to minimize download time but choosing the best
> server at the same time).

I haven't looked at them in detail for this purpose, but I still think
p2p systems are a natural for this. Layering .deb package retrieval onto
BitTorrent or similar would rock. I'm sure others know much more about
the issues though.

> You mean via HTTP? This would be possible to add, I think. I guess it
> hasn't been considered a priority.

Not necessarily, it depends on the cache architecture. Trying to do this
with apt-cacher, for example, would suck mightily because it uses a flat
cache structure. What's really needed to make it trivially browseable is
a cache that stores objects in a structure that mimics the original
mirror structure. My understanding is that apt-proxy v2 was written with
this in mind, but as usual I'm probably wrong.

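To make the expiry scheme above (index files by age, .debs by reference) and its failure mode concrete, here is a rough sketch in Python. It is not apt-cacher's actual code. The function names, the in-memory data layout and the `guard` option are all made up for illustration, and it assumes a flat cache where .debs are keyed by basename.

```python
import posixpath


def referenced_debs(packages_text):
    """Return the basenames named by 'Filename:' fields in a Packages index."""
    names = set()
    for line in packages_text.splitlines():
        if line.startswith("Filename:"):
            path = line.split(":", 1)[1].strip()
            # Flat cache assumption: only the basename identifies a .deb.
            names.add(posixpath.basename(path))
    return names


def plan_expiry(indexes, debs, now, max_age, guard=False):
    """Decide what a cleanup pass would delete (nothing is removed here).

    indexes: dict of index name -> (fetch_time, packages_text)
    debs:    set of cached .deb basenames
    guard:   hypothetical safety switch; if every index has expired,
             keep the .debs instead of purging them all.
    Returns (expired_indexes, purged_debs) as sets.
    """
    # Step 1: index files expire purely on age.
    expired = {name for name, (fetched, _) in indexes.items()
               if now - fetched > max_age}

    # Step 2: a .deb survives only if a non-expired index references it.
    live = set()
    for name, (_, text) in indexes.items():
        if name not in expired:
            live |= referenced_debs(text)
    purged = debs - live

    # Without the guard, "no live index" means "purge everything".
    if guard and len(expired) == len(indexes):
        purged = set()
    return expired, purged


if __name__ == "__main__":
    packages = (
        "Package: foo\n"
        "Filename: pool/main/f/foo/foo_1.0_i386.deb\n"
        "\n"
        "Package: bar\n"
        "Filename: pool/main/b/bar/bar_2.0_all.deb\n"
    )
    cached = {"foo_1.0_i386.deb", "bar_2.0_all.deb", "old_0.1_all.deb"}
    # Index fetched at t=100 with a max age of 100: fresh at t=150, stale at t=1000.
    print(plan_expiry({"Packages": (100, packages)}, cached, 150, 100))
    print(plan_expiry({"Packages": (100, packages)}, cached, 1000, 100))
```

With a fresh index only the unreferenced .deb is selected for purging. Once the index itself has aged out, every .deb is selected, which is the "cache gets nuked" case. The `guard` flag is just one possible way to avoid that, not something apt-cacher is known to do.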
Cheers :-)

Jonathan Oxer
--
The Debian Universe: Installing, managing and using Debian GNU/Linux
http://www.debianuniverse.com/