On Thu, 2009-03-12 at 12:57 -0700, IanR wrote:
> I'm processing RSS content from a number of given sources. Most of the
> time the URL given by the RSS feed redirects to the real URL (I'm
> guessing they do this for tracking purposes).
>
> For example, this is a URL that I get from an RSS feed:
> http://www.pheedcontent.com/click.phdo?i=d22e9bc7641aab8a0566526f61806512
> It redirects to:
> http://www.macsimumnews.com/index.php/archive/klipsch_developing_headphones_for_new_ipod_shuffle/
>
> I want to record the final URL and not the URL I get from the RSS feed.
> (However, sometimes there is no redirect, so I might want the original
> URL.)
>
> I've tried sniffing the header and don't see any "Location:"... I
> think sites are using different ways to redirect. Does anyone have
> any suggestions on how I might handle this?

If you are using urllib[2]:

>>> import urllib2
>>> url = 'http://www.pheedcontent.com/click.phdo?i=d22e9bc7641aab8a0566526f61806512'
>>> o = urllib2.urlopen(url)
>>> o.url
'http://www.macsimumnews.com/index.php/archive/klipsch_developing_headphones_for_new_ipod_shuffle/'

urlopen follows HTTP redirects for you, so o.url is the final URL. If there is no redirect, it is simply the URL you requested.
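
For anyone on Python 3, where urllib2 became urllib.request, the same idea might look roughly like the sketch below. The function name resolve_final_url, the timeout default, and the opener parameter are my own inventions for illustration (the opener argument is only there so the function can be tried without network access). It falls back to the original URL when the request fails, which is an assumption about what you would want. Note that this only follows HTTP-level (3xx) redirects; a page that redirects via a meta refresh tag or JavaScript would not be followed.

```python
import urllib.request


def resolve_final_url(url, timeout=10, opener=urllib.request.urlopen):
    """Return the URL reached after HTTP redirects, or `url` itself on failure.

    `opener` is a hypothetical hook (defaults to urllib.request.urlopen)
    so the function can be exercised with a stand-in during testing.
    """
    try:
        response = opener(url, timeout=timeout)
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and socket timeouts;
        # ValueError covers malformed URLs. Keep the feed's URL in that case.
        return url
    try:
        # geturl() reports the URL of the resource actually retrieved,
        # i.e. the final one after any redirects were followed.
        return response.geturl()
    finally:
        response.close()
```

You would then call resolve_final_url(entry_url) for each link taken from the feed and store whatever comes back.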