Re: Issue 211 in html5lib: Awful memory leak / infinite loop

html5lib Fri, 24 Aug 2012 09:33:57 -0700

Comment #2 on issue 211 by jos...@metaoptimize.com: Awful memory leak / infinite loop

http://code.google.com/p/html5lib/issues/detail?id=211


The Markdown comes from the wild and is probably invalid.

My idea was to pass the HTML through tidy before running an HTML parser, thus avoiding an infinite loop. There are several tidy wrappers in Python. I used pytidylib.

I didn't play with the options to make tidy stricter, and even after tidy, html5lib still goes into an infinite loop. So my current workaround is to use tidy followed by lxml :\
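
Roughly, the tidy-then-lxml workaround looks like the sketch below. This is only an illustration, not the exact code in use: the function name tidy_then_parse and its tidy/parse parameters are made up for this example, tidy runs with pytidylib's default options, and the choice of lxml.html.fromstring as the lxml entry point is an assumption.

```python
def tidy_then_parse(html, tidy=None, parse=None):
    """Clean markup with tidy first, then hand the result to lxml.

    `tidy` and `parse` are hypothetical hooks so either step can be
    swapped out; by default they use pytidylib and lxml.
    """
    if tidy is None:
        # pytidylib's tidy_document() returns (cleaned_markup, error_report).
        from tidylib import tidy_document

        def tidy(text):
            cleaned, _errors = tidy_document(text)  # default options, nothing stricter
            return cleaned

    if parse is None:
        # Assumption: lxml.html.fromstring is the lxml parser being used.
        import lxml.html
        parse = lxml.html.fromstring

    return parse(tidy(html))
```

With pytidylib (plus the underlying tidy library) and lxml installed, calling tidy_then_parse(some_html) returns an lxml element tree built from the tidied markup. html5lib is not involved at any point, so the loop is sidestepped rather than fixed.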


--
You received this message because you are subscribed to the Google Groups 
"html5lib-discuss" group.
To post to this group, send an email to html5lib-discuss@googlegroups.com.
To unsubscribe from this group, send email to 
html5lib-discuss+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/html5lib-discuss?hl=en-GB.
