Browsers accept a much looser syntax for HTML comments than BeautifulSoup does. I keep running into sites with formally incorrect HTML comments that browsers parse happily. Here's yet another example, this one from "http://www.webdirectory.com". The page starts like this:

    <!Hello there! Welcome to The Environment Directory!>
    <!Not too much exciting HTML code here but it does the job! >
    <!See ya, - JD >
    <HTML><HEAD>
    <TITLE>Environment Web Directory</TITLE>

Those are, of course, invalid HTML comments. But Firefox, IE, etc. handle them without problems. BeautifulSoup can't parse this page usefully at all. It treats the entire page as a text chunk. It's actually HTMLParser that parses comments, so this is really an HTMLParser level problem.

John Nagle
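
One possible workaround, as a rough sketch only: rewrite the bogus "<!...>" constructs into well-formed comments before the markup reaches the parser. The helper name fix_bogus_comments and the regular expression are my own assumptions, not part of BeautifulSoup or HTMLParser, and the pattern deliberately skips real comments, DOCTYPE declarations, and "<![" marked sections. It has not been tried against the full page.

```python
import re

# Hypothetical pattern: "<!" followed by anything up to ">", unless it is
# already a real comment ("<!--"), a DOCTYPE, or a "<![...]" marked section.
BOGUS_COMMENT = re.compile(r"<!(?!--)(?!DOCTYPE)(?!\[)([^>]*)>", re.IGNORECASE)


def fix_bogus_comments(html):
    """Turn '<!text>' into '<!--text-->' so a strict parser sees a comment."""
    def repl(match):
        # A literal "--" inside a comment body is itself invalid, so split it.
        body = match.group(1).replace("--", "- -")
        return "<!--" + body + "-->"
    return BOGUS_COMMENT.sub(repl, html)


page = ("<!Hello there! Welcome to The Environment Directory!> "
        "<!See ya, - JD > <HTML><HEAD>")
print(fix_bogus_comments(page))
```

The cleaned string can then be handed to BeautifulSoup in the usual way. This only papers over the problem on the caller's side; the underlying comment handling in HTMLParser is unchanged.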