Hello,

How about two-stage processing? Load the HTML into WebKitGTK, dump the DOM (
https://webkitgtk.org/reference/webkit2gtk/stable/WebKitWebPage.html#webkit-web-page-get-dom-document),
which already contains the parsed structure, sanitize that DOM, and then
display the serialized version of the modified DOM, keeping it for future use.

It should be more secure, too.
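
To make the "parse, sanitize, re-serialize" idea concrete, here is a rough sketch. It is not Vala and does not touch WebKitGTK; it uses Python's standard html.parser only because that parser tolerates tag soup and can be fed in chunks. The names Sanitizer, sanitize, BLOCKED and VOID are made up for this example, and the filtering policy (drop script/style, on* attributes and javascript: URLs) is an assumption for illustration, not a complete security policy.

```python
from html import escape
from html.parser import HTMLParser

BLOCKED = {"script", "style"}           # hypothetical policy: drop these elements entirely
VOID = {"br", "hr", "img", "input", "meta", "link"}  # elements with no end tag


class Sanitizer(HTMLParser):
    """Event-driven filter: parse tolerant HTML, drop risky parts, re-emit markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # serialized output fragments
        self.skip_depth = 0    # > 0 while inside a blocked element

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKED:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = []
        for name, value in attrs:
            if name.startswith("on"):
                continue       # inline event handlers such as onclick
            if value is not None and value.strip().lower().startswith("javascript:"):
                continue       # script URLs
            kept.append(name if value is None
                        else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join([tag] + kept) + ">")

    def handle_endtag(self, tag):
        if tag in BLOCKED:
            if self.skip_depth:
                self.skip_depth -= 1
            return
        if not self.skip_depth and tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    # Comments, doctypes and processing instructions are dropped,
    # because their handlers are left as the default no-ops.


def sanitize(chunks):
    """Feed the input piece by piece, as if it arrived from a stream."""
    parser = Sanitizer()
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize(['<p onclick="x()">Hi <b>the',
                    're</b><script>alert(1)</script></p>']))
```

Note that this does not repair the nesting of broken markup the way a real DOM parser does; it only filters and re-emits what it sees, so it illustrates the second stage more than the first.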

m.

2016-08-01 10:01 GMT+02:00 Michael Gratton <m...@vee.net>:

>
> Hey all,
>
> I'm looking for an HTML tag soup library for Geary, that can load tag soup
> HTML (i.e. possibly malformed) from a stream, allow some manipulation of
> it, and re-serialise it for display in WebKitGTK. Ideally, a pull-parser
> API like libxml2's TextReader or StAX[0] would be great, so the whole
> document does not need to be kept in memory as it is processed.
>
> These are the ones I know about:
>
> libxml2:
> - Pros: Has a pull parser API, has a HTML4 tag soup parser, installed
> everywhere
> - Cons: Pull parser doesn't work with HTML parser without reading whole
> document into memory, HTML parser out of date(?)
>
> GXml:
> - Pros: Nice Vala API, uses libxml2 under the hood
> - Cons: Not a pull parser, loads whole document into memory, doesn't seem
> to be packaged for any distros, doesn't use the libxml HTML parser(?)
>
> Others:
> - WebKitGTK+: Great tag soup parser, no pull API, doesn't allow
> manipulating the markup before displaying it (which is the main reason I
> need to parse the HTML beforehand)
> - XML Bird: Nice Vala API, but not a pull parser or a HTML parser
>
> So none of these seem to completely fit the bill. Are there any other
> options out there that I have missed? Has anyone else had to parse tag soup
> in Vala?
>
> Ta!
> //Mike
>
> [0] - <https://en.wikipedia.org/wiki/StAX>
>
> --
> ⊨ Michael Gratton, Percept Wrangler.
> ⚙ <http://mjog.vee.net/>
>
>
> _______________________________________________
> vala-list mailing list
> vala-list@gnome.org
> https://mail.gnome.org/mailman/listinfo/vala-list
>