Issue 80 in html5lib: TypeError when serializing some pages to BeautifulSoup

codesite-noreply Mon, 23 Mar 2009 13:27:25 -0700


Comment #5 on issue 80 by nikolay.panov: TypeError when serializing some  
pages to BeautifulSoup
http://code.google.com/p/html5lib/issues/detail?id=80


Well, the lxml treebuilder represents "<a><div><div><a>" as:
>>> import html5lib
>>> from lxml import etree
>>> e = html5lib.HTMLParser(tree=html5lib.treebuilders.getTreeBuilder("lxml")).parse("<a><div><div><a>")
>>> etree.tostring(e)
'<html><head/><body><a/><div><a></a><div><a></a><a/></div></div></body></html>'

This representation does not look right either.
Without html5lib, lxml parses the same string as
"<html><body><a><div><div><a/></div></div></a></body></html>".

And BeautifulSoup on its own gives:
>>> BeautifulSoup.BeautifulSoup("<a><div><div><a>")
<a><div><div></div></div></a><a></a>
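
One reason the three results differ is that the input has four start tags and no end tags at all, so each tree builder has to decide for itself where the elements close. The sketch below is only an illustration of that point: it uses the Python 3 standard library tokenizer (html.parser) rather than html5lib, lxml or BeautifulSoup, and the names TagLogger and tag_events are hypothetical helpers made up for this example.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start and end tags in the order the tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    """Return the list of (kind, tag) events found in the markup."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.events


# The same test string as in the comment above.
events = tag_events("<a><div><div><a>")
print(events)
```

The list contains only "start" events, so every closing tag in the outputs above was supplied by the parser that produced it, not by the input.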



