sophie_newbie wrote:
> Hi, I'm wondering how I'd go about extracting a string array of all
> comments in an HTML file, HTML comments obviously taking the format
> "<!-- Comment text here -->".
>
> I'm fairly stumped on how to do this. Maybe using regular expressions?
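
    from lxml import etree

    # Parse with the lenient HTML parser rather than the strict XML one
    parser = etree.HTMLParser()
    tree = etree.parse("somefile.html", parser)

    # "//comment()" selects every comment node in the document;
    # each node's .text attribute holds the comment string itself
    print(tree.xpath("//comment()"))

http://codespeak.net/lxml

Stefan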
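
If lxml is not installed, roughly the same result can be had from the standard library alone. The following is only a sketch, assuming Python 3 and its html.parser module (not the lxml approach above); the names CommentCollector and extract_comments are made up for illustration. It returns the comments as a plain list of strings, which is what the question asked for.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Hypothetical helper: gathers the text of every HTML comment fed to it."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->", without the delimiters
        self.comments.append(data.strip())


def extract_comments(html_text):
    """Return a list with the text of each comment found in html_text."""
    collector = CommentCollector()
    collector.feed(html_text)
    collector.close()
    return collector.comments


sample = "<html><!-- first --><body><p>text</p><!-- second --></body></html>"
print(extract_comments(sample))
```

Unlike a regular expression, the parser keeps track of where it is in the markup, so comment-like text inside a script block or an attribute value is less likely to be picked up by mistake.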