On Dec 22, 4:24 pm, "Colin J. Williams" <cjwilliam...@gmail.com> wrote: > On 21-Dec-10 12:22 PM, Jon Clements wrote: > > > import lxml > > from urlparse import urlsplit > > > doc = lxml.html.parse('http://www.google.com') > > print map(urlsplit, doc.xpath('//a/@href')) > > > [SplitResult(scheme='http', netloc='www.google.co.uk', path='/imghp', > > query='hl=en&tab=wi', fragment=''), SplitResult(scheme='http', > > netloc='video.google.co.uk', path='/', query='hl=en&tab=wv', > > fragment=''), SplitResult(scheme='http', netloc='maps.google.co.uk', > > path='/maps', query='hl=en&tab=wl', fragment=''), > > SplitResult(scheme='http', netloc='news.google.co.uk', path='/nwshp', > > query='hl=en&tab=wn', fragment=''), ...] > > Jon, > > What version of Python was used to run this? > > Colin W.
2.6.5 - the lxml library is not a standard module though and needs to be installed. -- http://mail.python.org/mailman/listinfo/python-list