Re: create a DOM from an HTML document

Paul Boddie Thu, 09 Feb 2006 13:55:44 -0800

John J. Lee wrote:
> Mark Harrison <[EMAIL PROTECTED]> writes:
>
> > Ahh, it's BeautifulSoup...
>
> Strictly that's not THE DOM, just A document object model.  The DOM
> proper is a standardised interface, which BeautifulSoup does not
> implement.  You could build a DOM using BeautifulSoup, though.


For a certain value of standardised, libxml2dom provides "the DOM" for
HTML:

import urllib, libxml2dom
# Fetch the page source.
f = urllib.urlopen("http://www.python.org")
s = f.read(); f.close()
# html=1 asks for the HTML parser rather than the XML one.
d = libxml2dom.parseString(s, html=1)
print "There are", len(d.xpath("//table")), "tables in the document."

See http://www.python.org/pypi/libxml2dom for more information.
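
For comparison, here is a rough sketch of the same kind of query written against the standardised DOM interface itself, using xml.dom.minidom from the standard library. It is written for Python 3, unlike the snippet above. The sample markup and the count_elements helper are made up for illustration, and minidom only accepts well-formed XML, so this will not cope with real-world tag soup the way an HTML parser does.

```python
from xml.dom import minidom

# Hypothetical sample markup (not from the original post). It has to be
# well-formed XML, because minidom is an XML parser, not an HTML parser.
SAMPLE = """<html><body>
<table><tr><td>one</td></tr></table>
<div><table><tr><td>two</td></tr></table></div>
</body></html>"""


def count_elements(markup, tag_name):
    """Parse markup and count matching elements using standard DOM calls."""
    document = minidom.parseString(markup)
    # getElementsByTagName is part of the W3C DOM interface and searches
    # the whole document tree, much like the XPath query //tag_name.
    return len(document.getElementsByTagName(tag_name))


print("There are", count_elements(SAMPLE, "table"), "tables in the document.")
```

The point is only that getElementsByTagName and friends are the standard DOM calls. Which library sits underneath is a separate choice.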

Paul

-- 
http://mail.python.org/mailman/listinfo/python-list
