I am indeed planning to isolate Chawan's HTML5 parser into a separate library. Right now I'm evaluating the best way to design an API that doesn't involve bringing in half of Chawan as a dependency. Preferably it would work similarly to [html5ever](https://github.com/servo/html5ever), so that you could supply your own DOM implementation. (Eventually the library could also provide a basic DOM skeleton for ease of use.)
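
To make the "supply your own DOM" idea a bit more concrete, here is a rough sketch of that kind of API. It is written in Python only for brevity, and every name in it (`TreeSink`, `build_tree`, `DictSink`) is hypothetical: it is neither html5ever's actual interface nor Chawan's current code, and `build_tree` merely replays ready-made events instead of doing real HTML5 tree construction.

```python
from typing import Protocol, TypeVar

H = TypeVar("H")  # opaque node handle; its type is chosen by the caller


class TreeSink(Protocol[H]):
    """Hypothetical callback interface: the parser calls it, the caller owns the DOM."""

    def document(self) -> H: ...
    def create_element(self, name: str) -> H: ...
    def append_child(self, parent: H, child: H) -> None: ...
    def append_text(self, parent: H, text: str) -> None: ...


def build_tree(events, sink):
    """Stand-in for the tree builder: replays (kind, value) events into the sink."""
    stack = [sink.document()]  # stack of open elements, document at the bottom
    for kind, value in events:
        if kind == "start":
            node = sink.create_element(value)
            sink.append_child(stack[-1], node)
            stack.append(node)
        elif kind == "end":
            stack.pop()
        elif kind == "text":
            sink.append_text(stack[-1], value)
    return stack[0]


class DictSink:
    """Example user-side DOM made of plain dicts; the library never sees its layout."""

    def document(self):
        return {"name": "#document", "children": []}

    def create_element(self, name):
        return {"name": name, "children": []}

    def append_child(self, parent, child):
        parent["children"].append(child)

    def append_text(self, parent, text):
        parent["children"].append(text)


events = [("start", "p"), ("text", "hi"), ("end", "p")]
doc = build_tree(events, DictSink())
```

The point of this shape is that the parsing side only ever touches opaque handles returned by the sink, so it needs no DOM types of its own and therefore no dependency on the rest of the browser.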
I'm not sure putting it in the stdlib is the best idea: together with the tokenizer it's about 4k lines of code. That's quite a liability for the maintainers, especially while they are trying to slim down the stdlib. (Not to mention that it depends on Chawan's [decoderstream](https://git.sr.ht/~bptato/chawan/tree/master/item/src/encoding/decoderstream.nim), which would again be hell to integrate.) In short, I would rather make it a separate library.