Comment #1 on issue 211 by way...@gmail.com: Awful memory leak / infinite loop
http://code.google.com/p/html5lib/issues/detail?id=211
I can't comment on the infinite loop, but as the maintainer of the Markdown library, I was concerned by the original reporter's implication that Markdown may be producing invalid HTML. Only the output is provided, not the input, but it appears to me that the invalid output is the result of invalid input. You should be wrapping those bare angle-bracket tags in code spans. So "(`<button>` and `<a>`)" (note the backticks surrounding each tag) would be output by Markdown as "(<code>&lt;button&gt;</code> and <code>&lt;a&gt;</code>)", which is valid HTML and will not result in an infinite loop in html5lib.
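To illustrate the difference the backticks make, here is a minimal sketch using only the Python standard library. It is a toy stand-in for code-span handling, not the Markdown library's actual implementation, and the helper name `render_code_spans` is hypothetical. It assumes single-backtick spans only.

```python
import html
import re

# Hypothetical helper: a toy model of code-span handling, NOT the
# Markdown library's real implementation. Single-backtick spans only.
CODE_SPAN = re.compile(r"`([^`]+)`")


def render_code_spans(text):
    """Wrap `...` spans in <code> and escape their contents.

    Text outside a code span is passed through unchanged, so bare
    tags in the input remain bare tags in the output.
    """
    return CODE_SPAN.sub(
        lambda m: "<code>" + html.escape(m.group(1), quote=False) + "</code>",
        text,
    )


# With backticks, the angle brackets are escaped inside <code>.
print(render_code_spans("(`<button>` and `<a>`)"))

# Without backticks, the bare tags reach the output as-is.
print(render_code_spans("(<button> and <a>)"))
```

The second call shows the failure mode I suspect in the report: bare `<button>` and `<a>` tags in the input are treated as raw HTML and pass straight through, unclosed.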
If the Markdown input is coming from an untrusted third party, then you absolutely should be sanitizing the output before passing it on to anything else.
That said, one such way to sanitize (my recommendation) is to use the Bleach library [1], which uses html5lib internally. So I guess we're back to that infinite loop.
[1]: http://bleach.readthedocs.org/en/latest/index.html