Re: [Vala] Manipulating HTML tag soup in Vala

2016-08-02 Thread mar...@saepia.net
How about using headless WebKit for the sanitizing phase? It is definitely possible; it's often used in automated testing of web apps. Then in the local DB you can store the post-processed DOM tree instead of the original mail. If you want to avoid fetching dependencies, it should be possible in headless

Re: [Vala] Manipulating HTML tag soup in Vala

2016-08-01 Thread Michael Gratton
On Tue, Aug 2, 2016 at 9:04 AM, mar...@saepia.net wrote: How about 2-stage processing? Loading HTML into WebKitGtk, dumping the DOM (https://webkitgtk.org/reference/webkit2gtk/stable/WebKitWebPage.html#webkit-web-page-get-dom-document), which contains the already parsed structure, sanitizing the DOM and

Re: [Vala] Manipulating HTML tag soup in Vala

2016-08-01 Thread mar...@saepia.net
Hello, how about 2-stage processing? Loading HTML into WebKitGtk, dumping the DOM (https://webkitgtk.org/reference/webkit2gtk/stable/WebKitWebPage.html#webkit-web-page-get-dom-document), which contains the already parsed structure, sanitizing the DOM and displaying a serialized version of the modified DOM for the f

[Vala] Manipulating HTML tag soup in Vala

2016-08-01 Thread Michael Gratton
Hey all, I'm looking for an HTML tag soup library for Geary that can load tag soup HTML (i.e. possibly malformed) from a stream, allow some manipulation of it, and re-serialise it for display in WebKitGTK. Ideally, a pull-parser API like libxml2's TextReader or StAX[0] would be great, so th
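
As a rough illustration of the load, manipulate, re-serialise cycle described above, here is a minimal sketch in Python rather than Vala, using only the standard library's html.parser. Note that html.parser is an event-driven (push) parser, not a pull parser like TextReader or StAX, so this only approximates the API shape asked for. The allow-lists, SAFE_SCHEMES, and the names TagSoupSanitizer and sanitize are invented for this example. It is not Geary's code and not a security-reviewed sanitiser.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-lists, chosen only for illustration.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "br", "ul", "ol", "li"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
VOID_TAGS = {"br"}                 # elements that never get a closing tag
DROP_CONTENT = {"script", "style"}  # elements whose text is discarded too


def _keep_attr(tag, name, value):
    """Keep only allow-listed attributes; restrict href to known schemes."""
    if name not in ALLOWED_ATTRS.get(tag, set()):
        return False
    if name == "href":
        return (value or "").strip().lower().startswith(SAFE_SCHEMES)
    return True


class TagSoupSanitizer(HTMLParser):
    """Tolerant parser that filters tags and re-serialises balanced HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # serialised output fragments
        self.open_tags = []  # stack of allowed tags still open
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if _keep_attr(tag, name, value)
        )
        self.out.append(f"<{tag}{kept}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in self.open_tags:
            return  # stray or disallowed closing tag: ignore it
        # Close anything left open inside this element, then the element.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    def result(self):
        self.close()  # flush any buffered text
        while self.open_tags:  # balance tags the input never closed
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def sanitize(html_text):
    parser = TagSoupSanitizer()
    parser.feed(html_text)
    return parser.result()
```

For example, sanitize('<p>Hello <b>world<script>alert(1)</script></p>') drops the script element and closes the dangling b element. Because feed() can be called repeatedly with chunks, the same class could be driven from a stream, although that is not shown here.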