[Chame](https://git.sr.ht/~bptato/chame) 0.14.0 has been released.

Quick ([re](https://forum.nim-lang.org/t/10367#69029)-)introduction: Chame is 
an HTML5 parser (implementing the WHATWG standard used by modern web browsers) 
that exposes low-level hooks for operating on user-provided DOM 
implementations. Its API design is heavily inspired by Servo's html5ever 
library.

The bad news for existing users is that this release completely breaks the old 
API. The good news is that no more significant API breakages are planned.

A migration guide from the old API can be found in the 
[NEWS](https://git.sr.ht/~bptato/chame/tree/b34ad233bf94715f6ab9cceca91f1296ca8003bc/item/NEWS)
 file. In general, it mostly consists of removing boilerplate, adding "Impl" to 
the end of hooked functions, and adapting your DOM to cooperate with Chame's 
string interning.

So what has changed since the last announcement?

  * The "bag of pointers" interface design has been dropped; we now use mixins 
and an interface definition file that users have to `include`.
  * Tag and attribute names are now treated as interned strings (a user-defined 
"Atom" type); for this, implementers have to provide functions to convert 
strings to their Atom implementation and to convert Atoms to/from TagTypes. 
minidom includes a very basic implementation of this.
  * Chame now fully supports special processing of embedded SVG/MathML elements.
  * Chame has been decoupled from the encoding/decoding library 
([Chakasu](https://git.sr.ht/~bptato/chakasu)); it is now completely optional. 
The minidom module now only supports UTF-8; for other encodings, import Chakasu 
and use minidom_cs. The htmlparser module does not know about character 
encodings; it assumes the input is valid UTF-8.
  * Instead of using std/streams, users of the low-level interface must now 
pass the buffers to be parsed through parseChunk.
  * Chame now passes the entirety of the tokenizer and tree builder parts of 
html5lib-tests.
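
For readers unfamiliar with string interning, here is a rough, language-neutral sketch of the idea behind the Atom requirement. It is written in Python purely for illustration and is not Chame's API (Chame is a Nim library); `AtomFactory`, `to_atom` and `to_str` are hypothetical names, and the pre-registered tag list is an assumption about how a DOM might keep well-known names at fixed positions.

```python
# Conceptual sketch only: NOT Chame's API. All names here are hypothetical.

class AtomFactory:
    """Maps each distinct string to a small integer ("atom") and back."""

    def __init__(self, known_names=()):
        self._ids = {}       # string -> atom
        self._strings = []   # atom -> string
        # Register well-known names first so they receive predictable atoms,
        # which makes converting between atoms and a tag enum a cheap lookup.
        for name in known_names:
            self.to_atom(name)

    def to_atom(self, s):
        """Return the atom for s, allocating a new one on first sight."""
        atom = self._ids.get(s)
        if atom is None:
            atom = len(self._strings)
            self._ids[s] = atom
            self._strings.append(s)
        return atom

    def to_str(self, atom):
        """Return the string that an atom was created from."""
        return self._strings[atom]


factory = AtomFactory(known_names=("html", "head", "body"))
div = factory.to_atom("div")
same_div = factory.to_atom("div")    # interning: same string, same atom
body = factory.to_atom("body")       # pre-registered, so it keeps index 2
```

With a table like this, comparing two tag or attribute names becomes an integer comparison instead of a string comparison, which is the usual motivation for interning in a parser.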



The full documentation is available [here](https://chawan.net/doc/chame/).

What is still missing for the 1.0 version?

  * SVG script end tag processing
  * proper parse error reporting



SVG script processing will be implemented soon, probably as an optional 
callback; this needs further investigation.

Parse error reporting might get left out entirely; it is neither useful nor 
correct in its current state, it would take plenty of effort to make it useful 
and correct without affecting performance, and I don't need it. If somebody 
really wants it, I accept patches.

Also, I have been considering some basic improvements to the usability of 
minidom (chunked parsing and exposing some DOM functionality); these will be 
present in 1.0.
