GSoC 2016 - std.experimental.xml after a month

Lodovico Giaretta via Digitalmars-d Thu, 23 Jun 2016 13:08:11 -0700

-- Brace yourself: a very long post is coming --

Hi,

One month after the official GSoC start, I want to share with you what's in std.experimental.xml and what will hopefully be there. If you have any questions, suggestions for improvement, or anything else to say, just leave a comment here or open an issue on GitHub (https://github.com/lodo1995/experimental.xml).

In particular, if you think there are problems with the current structure of the project, or major flaws in the APIs that will be very difficult to fix at a later stage, please let me know. (Walter and Andrei, I'd really appreciate your feedback here.)


Thank you in advance to all who will take time to read this...

What is working?

- Four lexers are provided to abstract different kinds of input from the other layers, providing different speed characteristics;
- The parser splits the document into nodes, doing most of the hard work;
- A cursor sits on top of the parser, providing an API to advance in the document and get information about the current node; it supports string interning, which can drastically lower memory consumption (given that most nodes share names and attributes);
- A validating cursor is the same as a cursor, but allows the user to plug in custom validators that are executed while advancing in the input; in the future the library will provide some predefined validators to use with it;
- A very simple SAX API built on top of the cursor API is the last thing added and tested;
- A partial reimplementation of std.xml is there; when completed it will allow a gradual code transition.
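
To give an idea of how a push-style SAX layer can sit on top of a pull-style cursor, here is a minimal sketch. It is written in Python only for brevity, and every name in it (Cursor, run_sax, Recorder, the event tuples) is hypothetical: this is not the std.experimental.xml API, just an illustration of the layering described above.

```python
# Illustrative sketch only: all names are hypothetical and none of this is
# the std.experimental.xml API (which is written in D).

class Cursor:
    """Pull-style view over a flat sequence of (kind, value) parser events."""

    def __init__(self, events):
        self._events = list(events)
        self._pos = 0

    def at_end(self):
        return self._pos >= len(self._events)

    def kind(self):
        return self._events[self._pos][0]

    def value(self):
        return self._events[self._pos][1]

    def advance(self):
        self._pos += 1


def run_sax(cursor, handler):
    """Push-style (SAX-like) driver built purely on the cursor's pull API."""
    while not cursor.at_end():
        if cursor.kind() == "start":
            handler.on_element_start(cursor.value())
        elif cursor.kind() == "end":
            handler.on_element_end(cursor.value())
        elif cursor.kind() == "text":
            handler.on_text(cursor.value())
        cursor.advance()


class Recorder:
    """Example handler that simply records the callbacks it receives."""

    def __init__(self):
        self.calls = []

    def on_element_start(self, name):
        self.calls.append(("start", name))

    def on_element_end(self, name):
        self.calls.append(("end", name))

    def on_text(self, text):
        self.calls.append(("text", text))


# Events a parser might emit for <a><b>hi</b></a> (assumed shape).
events = [("start", "a"), ("start", "b"), ("text", "hi"),
          ("end", "b"), ("end", "a")]
recorder = Recorder()
run_sax(Cursor(events), recorder)
```

The point of the design is that the SAX driver needs nothing beyond "where am I" and "advance", so anything stacked on the cursor (validation, for instance) is inherited by the SAX layer for free.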


What am I working on right now?

I'm trying to implement the DOM Level 3 API. The API per se is not that difficult, but the infrastructure I'm building around it is hell. In fact, I'm trying to make the DOM nodes reference counted and allocated with a custom allocator, to allow their usage in @nogc code. This is quite painful (because the DOM has lots of circular references, and "normal" reference counting does not work with them), but with enough time I will probably manage to make it work.
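
For readers unfamiliar with the problem: a parent owns its children and each child points back to its parent, so with plain reference counting neither count ever reaches zero. One common way around this (not necessarily the one the library will end up using) is to make the back-pointer a weak, non-owning reference. The sketch below shows the idea in Python, with the cycle collector switched off so that only reference counting is at work; it relies on CPython's reference-counting behaviour, and the Node class is purely illustrative.

```python
import gc
import weakref


class Node:
    """Hypothetical DOM-like node; not the library's actual class."""

    def __init__(self, name):
        self.name = name
        self.children = []   # strong (owning) references: parent -> child
        self._parent = None  # weak (non-owning) reference: child -> parent

    @property
    def parent(self):
        # Dereference the weak reference; yields None once the parent is gone.
        return self._parent() if self._parent is not None else None

    def append_child(self, child):
        child._parent = weakref.ref(self)
        self.children.append(child)
        return child


gc.disable()  # rely on reference counting alone, no cycle collector

root = Node("root")
child = root.append_child(Node("child"))
parent_seen_by_child = child.parent is root

probe = weakref.ref(root)  # lets us observe whether root is still alive
del root                   # drop the only strong reference to the root
root_freed = probe() is None
child_parent_after = child.parent

gc.enable()
```

Because the child never owns its parent, dropping the last outside reference to the root frees it immediately, and the child simply sees a missing parent afterwards.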


What is planned for the near future?

- When the DOM classes are usable (even if not 100% complete) I will start working on a DOM parser to build them from the source;
- DTD checking and entity substitution have to be implemented, and they will (I hope) fit nicely as pluggable components for the validating cursor;
- And of course some APIs to output XML.

What is (incidentally) inside the repository?

- Along with the DOM classes comes a wrapper that allows allocating classes with a custom allocator and reference counting them (that is, a RefCounted!T that works only for classes);
- A wonderful (or maybe not) benchmark driver that benchmarks the various components with various kinds of randomly generated files and prints some wonderful statistics and graphs;
- Needed by the benchmarking code, a simple API to collect statistical information (average, median, deviation) from a range of measurements;
- Needed by the cursor API, an Interner that can intern not only strings, but any array or class.
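
As a rough illustration of why interning helps, here is a minimal interner sketch. It is in Python rather than D, the Interner class shown is a hypothetical stand-in and not the one in the repository, and it only handles hashable values (strings and tuples here, standing in for "any array").

```python
class Interner:
    """Returns one canonical object for each distinct hashable value."""

    def __init__(self):
        self._pool = {}

    def intern(self, value):
        # setdefault stores the value the first time it is seen and
        # returns the already stored object on every later call.
        return self._pool.setdefault(value, value)

    def __len__(self):
        return len(self._pool)


interner = Interner()

# Build equal strings at run time, as a lexer slicing its input might.
raw_names = ["".join(["it", "em"]) for _ in range(1000)]
names = [interner.intern(n) for n in raw_names]

# The same approach works for other hashable sequences, e.g. tuples
# of attribute names.
attrs_a = interner.intern(tuple(["id", "class"]))
attrs_b = interner.intern(tuple(["id", "class"]))
```

A thousand element names end up referring to a single stored string, which is where the memory saving comes from when most nodes share names and attributes.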


Thank you again for your time and help.

Lodovico Giaretta
