At 03:56 PM 7/11/2006 -0400, Jim Fulton wrote:
>On Jul 11, 2006, at 2:07 PM, Phillip J. Eby wrote:
>>At 11:50 AM 7/11/2006 -0400, Jim Fulton wrote:
>>>I would stop when a result is found.
>>
>>Even so, this means O(N x M) web hits, where N is the number of
>>packages and M is the number of --find-links (including dependency
>>links supplied by eggs installed so far).  I don't think it's
>>reasonable to hit so many non-existent URLs on non-index servers,
>>and is impolite to the servers' operators.  (For example, if they
>>receive a daily report of all 404 errors from their web servers, as
>>I do.  This is pretty common on Red Hat boxes using logwatch, for
>>example.)
>>
>>It's particularly unfair since using e.g.
>>http://peak.telecommunity.com/snapshots/ as a --find-links while
>>installing, say TurboGears, would cause a whole host of "index"
>>hits to subdirectories of that URL, even though none of them can or
>>will be found.
>>
>>The fallout from this approach is far worse than any "screen
>>scraping" issues we've had.
>
>Isn't this the approach that's followed now?

No; only the --find-links pages themselves are read, and one assumes that
they actually exist.  :)

>Aren't all of the find-links searched as well as the index?  I suppose
>you're referring to the search for /projectname, which potentially
>doubles the number of requests.

Doubling is only the beginning.  If there are 5 dependencies, or 5
requirements on the command line, then it quintuples the number of
requests, and they're all going to be retrieving non-existent URLs, except
for whichever link was actually the package index.

Of course, this is also ignoring the UI reason why the index URL and
find-links URLs are specified separately, and that is that the common case
is to use PyPI and maybe also a find-link or two.  If they were specified
by the same option, then any use of find-links would require you to retype
the index URL.

So, it's not a very convenient UI to merge the concepts, as well as being
neither efficient for retrieval speed nor polite to site operators.
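
To put rough numbers on the request growth described above, here is a minimal sketch.  It is not easy_install's actual code: the function name, the package names "pkg-a" through "pkg-d", and the example.com URL are all made up for illustration, and it assumes an index-style probe means requesting base/projectname/ for every requirement.

```python
def index_style_probes(find_links, requirements):
    """List the URLs that would be requested if every --find-links URL
    were also probed like a package index (base/projectname/).

    Hypothetical helper for illustration only; not part of setuptools.
    """
    return [
        base.rstrip("/") + "/" + name + "/"
        for name in requirements
        for base in find_links
    ]


# Hypothetical inputs: the first URL is the one mentioned in the thread,
# the second URL and the "pkg-*" names are invented placeholders.
find_links = [
    "http://peak.telecommunity.com/snapshots/",
    "http://example.com/eggs/",
]
requirements = ["TurboGears", "pkg-a", "pkg-b", "pkg-c", "pkg-d"]

probes = index_style_probes(find_links, requirements)

# Reading only the find-links pages costs one request per page (M).
print(len(find_links), "requests when only the find-links pages are read")
# Probing each page per project adds N x M requests, nearly all of them 404s.
print(len(probes), "extra requests if each one is also probed per project")
```

With 2 find-links URLs and 5 requirements, that is 10 per-project probes on top of the 2 page reads, and the count keeps growing as installed eggs contribute more dependency links.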
