sophie_newbie wrote:
> Hi, I'm wondering how I'd go about extracting a string array of all
> comments in an HTML file, HTML comments obviously taking the format
> "<!-- Comment text here -->".
> 
> I'm fairly stumped on how to do this. Maybe using regular expressions?


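   from lxml import etree

   # HTMLParser tolerates the malformed markup common in real-world HTML
   parser = etree.HTMLParser()
   tree = etree.parse("somefile.html", parser)

   # //comment() selects every comment node; .text holds the comment's content
   print([comment.text for comment in tree.xpath("//comment()")])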
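
If lxml is not installed, roughly the same result can be had from the standard library alone. The following is only a sketch of that alternative, not the lxml approach: it swaps in html.parser, and the names CommentCollector and extract_comments are made up for illustration. It assumes the HTML is already in a string.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Hypothetical helper: gathers the text of every HTML comment fed to it."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between "<!--" and "-->", delimiters excluded
        self.comments.append(data.strip())


def extract_comments(html_text):
    """Return the comment strings found in html_text, in document order."""
    collector = CommentCollector()
    collector.feed(html_text)
    collector.close()
    return collector.comments


sample = "<html><!-- first --><body><p>hi</p><!-- second --></body></html>"
print(extract_comments(sample))  # prints ['first', 'second']
```

Either way, a real parser is likely a safer bet than a regular expression, which tends to trip over things like "-->" appearing inside scripts or attribute values.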


http://codespeak.net/lxml

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list
