Re: [Distutils] Sourceforge mirrors, again

2006-09-19 Thread Eric S. Johansson
Phillip J. Eby wrote:
 
 Anybody have any thoughts on this?

I wrote a bit of code for my raging dormouse downloader that follows 
links on the SourceForge pages until it reaches the real data.

Would that be helpful?

---eric

___
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] Sourceforge mirrors, again

2006-09-19 Thread Kevin Dangoor
On Sep 19, 2006, at 11:51 AM, Phillip J. Eby wrote:

 Well, it looks like Sourceforge has found yet another way to mess with
 easy_install's ability to download from their mirrors.  :(  Specifically,
 they are not keeping the dl.sourceforge.net A list up-to-date, so
 easy_install's attempts to just use simple round-robin DNS aren't always
 working.  Several IPs in the round robin A list are not responding, and
 some new mirrors haven't been added to it.  At this rate, the current
 approach will become unusable in a relatively short timeframe.  :(

 It seems as though there is no way to auto-discover the mirrors themselves;
 I had hoped that perhaps a zone transfer on the dl.sourceforge.net zone
 might work to obtain a list of the actual mirrors, but I haven't been able
 to successfully obtain one.

 What I'm wondering at this point is if perhaps the only sane thing to do is
 to publish our own mirror list via DNS, so that at least when there's a
 problem it can still be fixed.  The idea would be to replace easy_install's
 current DNS lookup of 'dl.sourceforge.net' IPs with something like
 'dl.sfmirrors.telecommunity.com' (for example).

That seems like a good idea. Another possibility, if we want to 
future-proof against changes to SF.net's download pages, is a web 
service that figures out where to send the user for a given file. 
That way, only that service needs to change if they change their 
download page format. That is, of course, a much heavier solution.

Doing the DNS change seems like a good idea.

Kevin


Re: [Distutils] Sourceforge mirrors, again

2006-09-19 Thread Phillip J. Eby
At 04:49 PM 9/19/2006 -0400, Kevin Dangoor wrote:
 That seems like a good idea. Another possibility, if we want to
 future-proof against changes to SF.net's download pages, is a web
 service that figures out where to send the user for a given file.
 That way, only that service needs to change if they change their
 download page format. That is, of course, a much heavier solution.

 Doing the DNS change seems like a good idea.

I've implemented a proof of concept as 'sf-mirrors.telecommunity.com', with 
a cron job that scrapes the mirror names via HTTP and then updates the zone 
file.  For the moment, it's set up to automatically halt if there's any 
change in the mirror names or the number of mirrors, so I can make sure the 
change isn't due to SF changing their UI again.  If there are no changes 
and the script is successful in pulling the current IPs for the named 
mirrors, it updates the zone file.
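
A minimal sketch of the halt-on-change logic described above, not the actual 
cron script. The function names, the resolver callback, the TTL, and the 
BIND-style record format are all assumptions made for illustration. The zone 
lines are produced only when the scraped mirror names match the last known 
list and every mirror resolves.

```python
# Hypothetical sketch; names, TTL, and record layout are assumptions.

def mirrors_unchanged(previous_names, scraped_names):
    """True only if both the set of names and the count are identical."""
    return (set(previous_names) == set(scraped_names)
            and len(previous_names) == len(scraped_names))


def build_zone_lines(previous_names, scraped_names, resolve,
                     zone_name="sf-mirrors.telecommunity.com", ttl=300):
    """Return A-record lines for the round-robin name, or None to halt.

    `resolve` is a caller-supplied function mapping a mirror name to a
    list of IP strings; it stands in for a real DNS lookup.
    """
    if not mirrors_unchanged(previous_names, scraped_names):
        return None  # possible UI change on the scraped page: stop for review
    ips = set()
    for mirror in scraped_names:
        try:
            found = resolve(mirror)
        except OSError:
            return None  # a failed lookup also leaves the zone untouched
        if not found:
            return None
        ips.update(found)
    return ["%s. %d IN A %s" % (zone_name, ttl, ip) for ip in sorted(ips)]
```

Returning None instead of a partial list keeps the old zone file in place 
whenever anything looks off, which matches the "automatically halt" behaviour 
described above.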

Anybody want to give it a try?  Just replace all references to 
'dl.sourceforge.net' in setuptools/package_index.py with references to 
'sf-mirrors.telecommunity.com'.
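
For readers unfamiliar with the round-robin approach, here is a rough sketch 
of what such a lookup amounts to. It is not the code in 
setuptools/package_index.py; the function names and the injectable `lookup` 
and `choose` parameters are assumptions added so the example can run without 
network access.

```python
import random
import socket


def resolve_mirror_ips(hostname, lookup=socket.gethostbyname_ex):
    """Return every IPv4 address published for `hostname`."""
    _canonical, _aliases, ips = lookup(hostname)
    return list(ips)


def pick_mirror_ip(hostname, lookup=socket.gethostbyname_ex,
                   choose=random.choice):
    """Pick one address from the round-robin A list."""
    ips = resolve_mirror_ips(hostname, lookup)
    if not ips:
        raise LookupError("no A records for %s" % hostname)
    return choose(ips)
```

Switching the mirror list is then just a matter of which hostname is passed 
in, e.g. `pick_mirror_ip('sf-mirrors.telecommunity.com')`.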

I'm not sure what I think about it, exactly.  One issue is that it makes it 
look like the software is phoning home to me, or that downloads are 
coming from my servers, even though they are unrelated.  It's also possible 
that some mirrors might freak when they receive a 'Host:' header that 
points to telecommunity.com!
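
One way the 'Host:' concern might be sidestepped, sketched under the 
assumption that the client connects to a resolved IP directly: keep the 
original download hostname in the header regardless of which DNS name 
supplied the IP. The helper name, the example path, and the IP below are 
hypothetical, and this is not what easy_install is known to do.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request


def request_via_ip(url, ip, host_header=None):
    """Build a request that targets `ip` but names the intended site in Host."""
    parts = urlsplit(url)
    host = host_header or parts.netloc  # default: hostname from the original URL
    ip_url = urlunsplit((parts.scheme, ip, parts.path,
                         parts.query, parts.fragment))
    return Request(ip_url, headers={"Host": host})


# Hypothetical file path and documentation-range IP, for illustration only.
req = request_via_ip(
    "http://dl.sourceforge.net/sourceforge/example/example-1.0.tar.gz",
    "192.0.2.1",
)
```

The request is only constructed here, not sent, so whether a given mirror 
accepts it remains untested.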

So, I'm not 100% sure this can work reliably yet.  Maybe it would be better 
to just encourage SF to fix their broken DNS records.  :(
