Hello debconf-video team,

I'm currently running the experimental debian-cd download redirector based on mirrorbrain at http://debian-cd.debian.net/, and pabs has asked me to look into using mirrorbrain for the Debian meetings video archive too. I did, and this is what I have found so far.

For those who do not know what mirrorbrain is: Mirrorbrain is a download redirector whose job is to redirect clients to a suitable mirror whenever they request a file. To do that, it runs a GeoIP lookup on the client's IP address to determine where that client is, and with that information it locates nearby mirrors. It also scans all the mirrors at regular intervals, so that it knows exactly which mirror has which files, and which mirrors are currently down. That way it (almost) never sends a client to a mirror that doesn't have the particular file or is currently down.

Now this could obviously be useful for the video archive, which delivers rather large files, where it does make a huge difference whether you fetch a file from a local mirror or from overseas at 1/10th the speed. It might even be an idea to send meetings-archive.d.n (which AFAIK is where all the official links to the recordings point) through the redirector instead of pointing it at the master site in Sweden.

However, there are a few problems with the meetings archive as it currently is that would reduce the effectiveness of the redirector:

1. The redirector needs to know which files are on which mirror, and for that it needs to scan/index the mirrors. It can do that through rsync, FTP, or, as a last resort, HTTP. For the HTTP variant to work, the directory listings must be in a format parseable by Mirrorbrain. Of the current mirrors, the amazonaws one does not have parseable directory listings, so it cannot be scanned and thus cannot be used by the redirector.

2. Even for the mirrors that can in principle be scanned, there are directories that cannot be scanned, e.g. /2006/debconf6 [1]. The reason is that those dirs contain an index.html that displays a nice explanation of what each file is, and Apache delivers that instead of the usual directory listing.
As nice as that is for human users, it means that mirrorbrain doesn't see a directory listing that it can parse in that directory, so it will record "this mirror doesn't have any files in that dir". Requests for files in those dirs will thus always be redirected to the "fallback", which is the master in Sweden. There are two ways to fix that: either the mirrors need to provide options to scan them other than HTTP, e.g. rsync or FTP; or (suggested by pabs) the index.html could be renamed to README.html. Apache would still display this as a footer below the directory listing.

3. Too few mirrors. There are currently only 3 working mirrors plus the main site in Sweden that could be fed into mirrorbrain. 3 of those 4 sites are in Europe, 1 is in Taiwan. Mirrorbrain unfortunately cannot do miracles. It would send all requests from Asia to the .tw mirror (if it has the requested file and is up, of course). It would also send all requests from Europe to a European mirror, and to the one in the same country if there is one. For all other parts of the world, however, including the US, it would just pick a random mirror in Europe because it doesn't have anything really suitable.

The mirrorbrain instance for the meetings archive is currently up at http://debian-meetings.poempelfox.de/debian-meetings/, but that will not be a permanent URL. If you decide that this would be useful, I would be willing to run it for the foreseeable future under a debian.net domain (perhaps even meetings-archive.d.n?), together with debian-cd.debian.net. Otherwise it will disappear soon.

[1] http://meetings-archive.debian.net/pub/debian-meetings/2006/debconf6/
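
To make the selection behaviour described in point 3 a bit more concrete, here is a rough Python sketch of the idea. It is not Mirrorbrain's actual code; the names (Mirror, pick_mirror), the mirror list, the country codes of the two extra European mirrors and the file paths are all made up for illustration. It also simplifies the "rest of the world" case to a uniform random pick among all usable mirrors, which with the sites above would usually, but not always, land in Europe.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Mirror:
    name: str       # hypothetical identifier
    country: str    # ISO country code, e.g. "se"
    region: str     # continent code, e.g. "eu"
    up: bool = True
    files: set = field(default_factory=set)  # paths seen during the last scan


def pick_mirror(path, client_country, client_region, mirrors, fallback, rng=random):
    """Pick a mirror for `path`: same country, then same region, then any."""
    # Only mirrors that are up and were seen to have the file are usable.
    candidates = [m for m in mirrors if m.up and path in m.files]
    if not candidates:
        # Nobody is known to have the file: send the client to the master.
        return fallback
    same_country = [m for m in candidates if m.country == client_country]
    if same_country:
        return rng.choice(same_country)
    same_region = [m for m in candidates if m.region == client_region]
    if same_region:
        return rng.choice(same_region)
    # Nothing nearby (simplification): random pick among all usable mirrors.
    return rng.choice(candidates)


# Made-up example data: the master in Sweden, two other European mirrors
# (countries invented), and one mirror in Taiwan.
master = Mirror("master-se", "se", "eu",
                files={"/2010/a.ogv", "/2006/debconf6/b.ogv"})
mirrors = [
    master,
    Mirror("eu-1", "de", "eu", files={"/2010/a.ogv"}),
    Mirror("eu-2", "gb", "eu", files={"/2010/a.ogv"}),
    # The scan found no listing for /2006/debconf6 here (index.html problem).
    Mirror("tw-1", "tw", "as", files={"/2010/a.ogv"}),
]

print(pick_mirror("/2010/a.ogv", "tw", "as", mirrors, master).name)           # tw-1
print(pick_mirror("/2006/debconf6/b.ogv", "tw", "as", mirrors, master).name)  # master-se
print(pick_mirror("/2010/a.ogv", "us", "na", mirrors, master).name)           # any usable mirror
```

The second call shows the effect of problem 2: because the scan recorded no files for /2006/debconf6 on the other mirrors, even a client in Taiwan ends up at the master in Sweden.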
