testsuite

Mirar @ Pike developers forum Wed, 17 Sep 2014 05:10:44 -0700

 >Well, yes, an HTML tokenizer would be useful. HTML5 has a very readable
 >specification.


So maybe it's time for a new tool.

 >Ironically enough it is about 10 times slower than ye olde Opera HTML5
 >parser at actually parsing HTML. :)

Yes, but I believe it was written to search for specific tags, not to
parse every single tag, let alone build a data structure around them. So
it's naturally pretty bad at anything that isn't RXML (as RXML was at the
time, too, probably) :)

 >I have seriously considered writing one. But the name 'Parser.HTML' is
 >already taken. :)

Which is bad. But it shouldn't be the largest obstacle. :)

Use a subtree. Parser.HTML.Tokenizer?
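
To make the difference concrete: a tokenizer hands back every tag and
text run in document order, instead of firing callbacks only for the
tags you registered. Below is a rough sketch in Python (not Pike, and
nowhere near the HTML5 tokenization algorithm). The name `tokenize` and
the (kind, value) token shape are made up for illustration, and the
regex assumes no '>' ever appears inside an attribute value.

```python
import re
from typing import Iterator, Tuple

# Hypothetical token shape: (kind, value), kind is "start", "end" or "text".
Token = Tuple[str, str]

# Simplified tag pattern; an assumption for this sketch, not the HTML5 rules.
# It ignores comments, doctype, CDATA and '>' inside quoted attributes.
_TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def tokenize(html: str) -> Iterator[Token]:
    """Yield every tag and text run in document order."""
    pos = 0
    for m in _TAG.finditer(html):
        if m.start() > pos:
            # Text between the previous tag and this one.
            yield ("text", html[pos:m.start()])
        kind = "end" if m.group(1) else "start"
        yield (kind, m.group(2).lower())
        pos = m.end()
    if pos < len(html):
        # Trailing text after the last tag.
        yield ("text", html[pos:])


if __name__ == "__main__":
    for token in tokenize("<p>Hi <b>there</b></p>"):
        print(token)
```

Something with that kind of interface is what I'd expect to live under
a name like Parser.HTML.Tokenizer, with the real thing following the
HTML5 spec's state machine rather than a regex.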
