On May 24, 3:26 am, "Reedick, Andrew" <[EMAIL PROTECTED]> wrote:
> c) If you're going to parse html/xml then bite the bullet and learn one
> of the libraries specifically designed to parse html/xml. Many other
> regex gurus have learned this lesson. Myself included. =)
Agreed. The BeautifulSoup approach is particularly nice (although not part of stdlib):

>>> import urllib
>>> from BeautifulSoup import BeautifulSoup
>>> html = urllib.urlopen('http://www.python.org/').read()
>>> soup = BeautifulSoup(html)
>>> links = [link['href'] for link in soup('link')]
>>> links[0]
u'http://www.python.org/channews.rdf'

- alex23
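If installing a third-party package is not an option, roughly the same extraction can be sketched with the standard library alone. This is only a minimal sketch, assuming Python 3 and its html.parser module; the names LinkHrefParser and extract_link_hrefs and the sample markup are made up for illustration, and the parser is far less forgiving of broken HTML than BeautifulSoup.

```python
from html.parser import HTMLParser


class LinkHrefParser(HTMLParser):
    """Hypothetical helper: collects the href of every <link> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        # Self-closing tags (<link ... />) also end up here via the default
        # handle_startendtag implementation.
        if tag == "link":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_link_hrefs(html):
    """Return the href values of all <link> tags in an HTML string."""
    parser = LinkHrefParser()
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Made-up sample markup, so the example runs without network access.
sample = """<html><head>
<link rel="alternate" type="application/rss+xml" href="http://www.python.org/channews.rdf">
<link rel="stylesheet" href="/styles/screen.css" />
</head><body><a href="/about/">About</a></body></html>"""

print(extract_link_hrefs(sample))
```

Only <link> tags are collected here, mirroring soup('link') above; the <a> tag in the sample is deliberately ignored.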