On May 24, 3:26 am, "Reedick, Andrew" <[EMAIL PROTECTED]> wrote:
> c) If you're going to parse html/xml then bite the bullet and learn one
> of the libraries specifically designed to parse html/xml. Many other
> regex gurus have learned this lesson. Myself included. =)
Agreed. BeautifulSoup is particularly nice for this (although it's a
third-party package, not part of the stdlib):
>>> import urllib
>>> from BeautifulSoup import BeautifulSoup
>>> html = urllib.urlopen('http://www.python.org/').read()
>>> soup = BeautifulSoup(html)
>>> links = [link['href'] for link in soup('link')]
>>> links[0]
u'http://www.python.org/channews.rdf'
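
If installing a third-party package isn't an option, the stdlib ships an HTML parser of its own that can pull out the same hrefs. This is only a rough sketch, assuming Python 3 (where the module is html.parser). The LinkCollector class name and the sample HTML are made up for illustration and are not taken from python.org, so it runs without network access:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <link> tags (hypothetical helper name)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attrs is a list of (name, value) pairs.
        if tag == "link":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


# Made-up sample document standing in for a fetched page.
sample_html = """<html><head>
<link rel="alternate" type="application/rss+xml" href="http://www.python.org/channews.rdf">
<link rel="stylesheet" href="/styles/screen.css">
</head><body><a href="/about/">About</a></body></html>"""

collector = LinkCollector()
collector.feed(sample_html)
links = collector.hrefs
```

It's more typing than the BeautifulSoup one-liner, and it is less forgiving of badly broken markup, which is a good part of why BeautifulSoup is the nicer tool here.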
- alex23