I have started a library that tools can use to select a PyPI mirror; see
https://svn.python.org/packages/trunk/pypi/tools/mirrorlib.py
It currently only deals with mirror selection, but will be extended to
deal with mirror validation and key rollover as well.

For mirror selection, the objective is to find a mirror that is both
fast and current.

1. The caller specifies a maximum acceptable response time for an HTTP
   request (to /last-modified), and a maximum acceptable age. The
   caller can also specify whether a.pypi.python.org should be included
   in the scan or not. Finally, the caller can specify a timeout for
   slow mirrors.
2. The library contacts the mirrors in order, interleaving DNS lookups
   with connecting to the mirrors whose IP addresses have been
   computed. No threads are created in that process.
3. If a mirror is found that meets the requirements, it is returned;
   this might mean that not all mirrors have been contacted.
4. If no mirror is found that meets the requirements, it contacts all
   mirrors. When the slow-mirrors timeout has passed, the youngest of
   all responding mirrors is returned.
5. If all mirrors are slow, the first one responding is returned.
6. If none respond, ValueError is raised (which will happen after the
   TCP connection timeout).

For specific parameters, I found the following defaults useful:

- the acceptable mirror age defaults to 30min. Within this time, all
  mirrors should have synchronized, otherwise, they are considered
  down. The only exception is when the central mirror is down, then the
  mirrors will all age.
- the acceptable response time defaults to 1s - all mirrors should be
  able to respond within this time, and it will then use the one that
  responds first, in enumeration order. Specifying 0.1s might also be
  useful in some applications; this will rule out slower mirrors (in
  particular, GAE).
- the slow mirrors timeout defaults to 5s. If the master is down, and
  some mirror is slow, this will be the time until selection completes
  (with the mirror that claims to have the latest copy).
- OTOH, if the master is down, and all mirrors respond to the TCP
  connect quickly (either accepting or refusing the connection), then
  it will quickly pick the newest mirror.

If there are any questions, feel free to ask.

Regards,
Martin
_______________________________________________
Catalog-SIG mailing list
[email protected]
http://mail.python.org/mailman/listinfo/catalog-sig
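
To illustrate steps 3 to 6 above, here is a minimal sketch of the
selection policy. It is not the code in mirrorlib.py: the names Probe
and select_mirror are made up for this example, and the sketch assumes
that all measurements have already been collected, whereas the library
interleaves DNS lookups and connects and can return early. The default
values are the ones given above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Probe:
    """Result of one /last-modified request (hypothetical type)."""
    host: str
    # Seconds until the reply arrived; None if the mirror never replied.
    response_time: Optional[float]
    # Seconds since the mirror last synchronized; ignored without a reply.
    age: float = 0.0


def select_mirror(probes: Sequence[Probe],
                  max_response_time: float = 1.0,
                  max_age: float = 30 * 60,
                  slow_timeout: float = 5.0) -> str:
    """Pick a host from probes listed in enumeration order (a sketch)."""
    answered = [p for p in probes if p.response_time is not None]
    if not answered:
        # Step 6: nobody responded at all.
        raise ValueError("no mirror responded")

    # Step 3: first mirror, in enumeration order, that is fast and current.
    for p in answered:
        if p.response_time <= max_response_time and p.age <= max_age:
            return p.host

    # Step 4: otherwise the youngest mirror that answered before the
    # slow-mirrors timeout expired.
    in_time = [p for p in answered if p.response_time <= slow_timeout]
    if in_time:
        return min(in_time, key=lambda p: p.age).host

    # Step 5: all mirrors are slow, so take whichever answered first.
    return min(answered, key=lambda p: p.response_time).host
```

For example, given Probe("a.example", 0.3, 7200),
Probe("b.example", 0.2, 600) and Probe("c.example", None), the call
select_mirror(...) with the defaults returns "b.example", because
a.example is more than 30 minutes old and c.example never replied.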
