Issue 113 in html5lib: cannot handle malformed attribute names with html5lib and lxml

codesite-noreply Tue, 27 Oct 2009 14:29:34 -0700


Comment #1 on issue 113 by eromirou: cannot handle malformed attribute
names with html5lib and lxml
http://code.google.com/p/html5lib/issues/detail?id=113


I found that passing 'sanitizer.HTMLSanitizer' as the tokenizer works
around the problem:

import html5lib
from html5lib import sanitizer, treebuilders

# "123" is not a valid XML attribute name, which is what trips up lxml.
html_code = "<a 123=456></a>"

# Use the sanitizer in place of the default tokenizer.
html_parser = html5lib.HTMLParser(
    tree=treebuilders.getTreeBuilder('lxml'),
    tokenizer=sanitizer.HTMLSanitizer)
print(html_parser.parse(html_code))


I don't know if this is the right way to do it, but it saved me.
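
My guess at why this helps: the sanitizer only lets known attributes through, so the bad name never reaches lxml. If you want to keep everything else and drop only the names that are not XML-safe, the same idea can be sketched in plain Python. This is only an illustration, not html5lib code: drop_invalid_attributes is a made-up helper, and the pattern is a simplified ASCII-only approximation of an XML name (no colons, no non-ASCII characters).

```python
import re

# Assumption: simplified, ASCII-only stand-in for the XML "Name" rule.
# A name must start with a letter or underscore; digits, dots and
# hyphens are only allowed after the first character.
_XML_SAFE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_.\-]*$")


def drop_invalid_attributes(attrs):
    """Return a copy of attrs without names that fail the check above.

    Hypothetical helper for illustration; not part of html5lib or lxml.
    """
    return {name: value for name, value in attrs.items()
            if _XML_SAFE_NAME.match(name)}


# The attribute from the example above ("123") is dropped, "href" is kept.
print(drop_invalid_attributes({"123": "456", "href": "/x"}))
```

Unlike the sanitizer, a filter like this would leave ordinary attributes alone, but it would have to be hooked in somewhere before the tree builder sees the tokens, and I have not tried that.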

Cheers
