In article <[EMAIL PROTECTED]>, "Miki" <[EMAIL PROTECTED]> wrote:
> Hello Shelton,
>
> > I am learning Python, and have never worked with HTML. However, I would
> > like to write a simple script to audit my 100+ Netware servers via their
> > web portal.
>
> Always use the right tool: BeautifulSoup
> (http://www.crummy.com/software/BeautifulSoup/) is best for web
> scraping (IMO).
>
> from urllib import urlopen
> from BeautifulSoup import BeautifulSoup
>
> html = urlopen("http://www.python.org").read()
> soup = BeautifulSoup(html)
> for link in soup("a"):
>     print link["href"], "-->", link.contents

Agreed. HTML scraping is really complicated once you get into it. It might
be interesting to write such a library just for your own satisfaction, but
if you want to get something done, use a module that's already written,
like BeautifulSoup.

Another module that will do the same job but works differently (and more
simply, IMO) is HTMLData by Connelly Barnes:
http://oregonstate.edu/~barnesc/htmldata/

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
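For what it's worth, here is a rough sketch of how the same BeautifulSoup
approach might look for the original server-audit question. The port,
URL pattern and the idea of pulling values out of table cells are
assumptions on my part; the real portal may want https and a login, and
you would adjust the parsing after looking at the HTML it actually serves.

    from urllib import urlopen
    from BeautifulSoup import BeautifulSoup

    # The servers to audit -- replace with your 100+ hosts.
    servers = ["server1.example.com", "server2.example.com"]

    for name in servers:
        # Assumed URL pattern for the web portal; adjust the port, scheme
        # and any authentication to whatever the portal really uses.
        html = urlopen("http://%s:8008/" % name).read()
        soup = BeautifulSoup(html)
        # Grab the text of every table cell as a starting point; in
        # practice you would narrow this to the cells you care about.
        cells = [td.string for td in soup.findAll("td") if td.string]
        print name, cells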