- Original Message -
From: Mark R. Diggory [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, July 14, 2004 8:48 AM
Subject: ASF Repository, closer.cgi and Depot
Sorry for the cross post but this seems relevant to both these groups.
I was thinking about the subject of mirroring and redirection for the
ASF Repository. There has already been some discussion on the Depot list
concerning this; I feel we could address the subject again, in both
groups' interest.
www.apache.org/dyn/closer.cgi provides a simple resolution strategy that
attempts to determine the closest mirror available to the client browser.
It then generates an HTML page via a template that lists the selected
mirror as well as the other available mirrors. With Depot, we have a
customized download client that could be extended to manage downloading
from a list of mirrors as well.
Here are my thoughts on this subject:
A.) This script is really not that big (90% of it is just parsing the
mirrors file), and the database (a flat text file called mirrors.list)
is not very big either. While closer.cgi is a neat service for
browsers, it's not exactly helpful for automated clients. Yet
mirrors.list is an excellent example of metadata exposed in an
effective manner, such that automated clients can access it:
http://www.apache.org/mirrors/mirrors.list
I'm fairly convinced it would be simple to create a client
implementation that accomplishes the same functionality as closer.cgi
programmatically, so that it could be used to resolve a location to
download from when mirrors are available.
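To make that concrete, here is a minimal sketch of what such a client-side resolver might look like. The exact field layout of mirrors.list is an assumption here (I'm treating it as whitespace-separated country-code/URL pairs with '#' comments, which may not match the real file), and the function names are just illustrative:

```python
import urllib.request

MIRRORS_LIST_URL = "http://www.apache.org/mirrors/mirrors.list"


def fetch_mirrors_list(url=MIRRORS_LIST_URL):
    """Fetch the raw mirrors.list text over HTTP."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", "replace")


def parse_mirrors(text):
    """Parse mirrors.list into (country_code, base_url) pairs.

    ASSUMPTION: each non-comment line is whitespace-separated with the
    country code first and the mirror URL second; adjust the indices if
    the real file orders its fields differently.
    """
    mirrors = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2:
            mirrors.append((fields[0].lower(), fields[1]))
    return mirrors


def closest_mirror(mirrors, client_country):
    """Naive 'closer' resolution: first mirror matching the client's
    country code, falling back to the first mirror in the list."""
    for country, url in mirrors:
        if country == client_country.lower():
            return url
    return mirrors[0][1] if mirrors else None
```

A real client would of course want a better locality heuristic than a country-code match, plus retry/failover across the remaining mirrors, but the parsing itself is as trivial as the script's.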
This would be beneficial to the Apache bandwidth issue in that, if a
client such as Depot/DownloadManager managed the same capability as
closer.cgi, then:
Hmm, it seems to me that infra@ or mirrors@ (is that a list?) probably have
views on this. (But then, we probably don't want 4 lists on here. :-) I
suspect their views would match what you suggest: distributing this might
save some nominal (c.f. artifact sizes) bandwidth and some CPU, but at a
significant loss of 'control' (over well-behaved clients). Central control
over this seems the most appealing.
Since I doubt the CPU cycles are worth saving (or the script would've been
optimised), could we not just change the script to check for some header
from the client, and return XML or some structured text for non-human
browsers? [BTW: viewcvs seems to do this nicely, returning the raw file
to non-human clients and an HTML presentation to humans, as the browser
identifies itself.]
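The header check could be as small as this sketch. It's only an illustration of the idea, not the actual script: keying on the Accept header is an assumption about how an automated client would identify itself (a User-Agent check would work the same way), and the plain-text output format is made up here:

```python
def negotiate_response(headers, mirrors):
    """Return (content_type, body) based on what the client asks for.

    'headers' is a dict of request headers. Clients whose Accept header
    does not mention text/html are treated as automated and get a
    structured plain-text listing; browsers get the usual HTML page
    (stubbed here in place of the real template).
    """
    accept = headers.get("Accept", "")
    if "text/html" not in accept:
        # Automated client: one "country<TAB>url" mirror per line.
        body = "\n".join("%s\t%s" % (cc, url) for cc, url in mirrors)
        return ("text/plain", body)
    # Human browser: render an HTML list of mirrors.
    items = "".join(
        "<li><a href='%s'>%s</a></li>" % (url, url) for _cc, url in mirrors
    )
    return ("text/html", "<html><body><ul>%s</ul></body></html>" % items)
```

That way the one central script keeps control while still serving the metadata in a form Depot-style clients can consume directly.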
regards,
Adam