-- Brace yourself: a very long post is coming --

Hi,

One month after the official GSoC start, I want to share with you what's in std.experimental.xml and what will hopefully be there. If you have questions, suggestions for improvement, or anything else to say, just leave a comment here or open an issue on GitHub (https://github.com/lodo1995/experimental.xml).

In particular, if you think there are problems with the current structure of the project, or major flaws in the APIs that will be very difficult to fix at a later stage, please let me know. (Walter and Andrei, I'd really appreciate your feedback here.)

Thank you in advance to all who will take time to read this...

What is working?
- Four lexers abstract different kinds of input from the other layers, each with different speed characteristics;
- The parser splits the document into nodes, doing most of the hard work;
- A cursor sits on top of the parser, providing an API to advance through the document and get information about the current node; it supports string interning, which can drastically lower memory consumption (given that most nodes share names and attributes);
- A validating cursor works like a cursor, but lets the user plug in custom validators that are executed while advancing through the input; in the future the library will provide some predefined validators to use with it;
- A very simple SAX API built on top of the cursor API is the last thing added and tested;
- A partial reimplementation of std.xml is there; when completed it will allow a gradual code transition.
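
To make the string interning point concrete, here is a minimal, language-neutral sketch of the idea, written in Python for brevity. It is not the library's API: `Interner`, `intern` and `fresh_copy` are hypothetical names used only for illustration. It shows how many equal element names collapse into a single shared instance.

```python
class Interner:
    """Hypothetical interner: maps equal strings to one shared instance."""

    def __init__(self):
        self._pool = {}

    def intern(self, text):
        # setdefault stores `text` the first time it is seen and returns the
        # stored instance on every later call with an equal string.
        return self._pool.setdefault(text, text)

    def __len__(self):
        return len(self._pool)


def fresh_copy(text):
    # Build a new string object at run time, much as a parser copying names
    # out of its input would, so equal names start out as separate objects.
    return "".join(list(text))


interner = Interner()
raw_names = [fresh_copy("item") for _ in range(1000)]
interned_names = [interner.intern(name) for name in raw_names]

# Count distinct objects (by identity), not distinct values.
distinct_before = len({id(name) for name in raw_names})
distinct_after = len({id(name) for name in interned_names})
print(distinct_before, "string objects before interning,", distinct_after, "after")
```

Once the un-interned copies are dropped, only one instance per distinct name stays alive, which is where the memory saving for documents with many repeated tag and attribute names would come from.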

What am I working on right now?
I'm trying to implement the DOM Level 3 API. The API per se is not that difficult, but the infrastructure I'm building around it is hell. In fact, I'm trying to make the DOM nodes reference counted and allocated with a custom allocator, so that they can be used in @nogc code. This is quite painful (because the DOM has lots of circular references, and "normal" reference counting never frees objects that reference each other in a cycle), but with enough time I will probably manage to make it work.
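
The cycle problem can be reproduced in a few lines. The sketch below is in Python, not D, and says nothing about how the library will actually solve it: it only contrasts strong parent links with weak ones, which is one common workaround. The class and function names are hypothetical, and the result assumes CPython, where objects are reference counted and the separate cycle collector can be switched off.

```python
import gc
import weakref


class StrongNode:
    """Child keeps a strong reference to its parent: parent <-> child cycle."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.parent = parent
        if parent is not None:
            parent.children.append(self)


class WeakNode:
    """Child keeps only a weak reference to its parent: no ownership cycle."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None for the root, or if the parent has already been freed.
        return self._parent() if self._parent is not None else None


def freed_by_refcount_alone(node_type):
    """Build a two-node tree, drop it, and report whether the root was freed."""
    gc.disable()  # leave only plain reference counting active
    try:
        root = node_type("root")
        node_type("child", root)  # kept alive through root.children
        probe = weakref.ref(root)  # observes the root without owning it
        del root
        return probe() is None
    finally:
        gc.enable()


print("strong parent links freed:", freed_by_refcount_alone(StrongNode))
print("weak parent links freed:", freed_by_refcount_alone(WeakNode))
```

With strong links the child's reference keeps the root's count above zero after the last outside reference is gone, so counting alone never releases the tree. With weak parent links the root is released as soon as it is dropped, and the children follow.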

What is planned for the near future?
- When the DOM classes are usable (even if not 100% complete), I will start working on a DOM parser to build them from the source;
- DTD checking and entity substitution have to be implemented, and they will (I hope) fit nicely as pluggable components for the validating cursor;
- And of course some APIs to output XML.

What is (incidentally) inside the repository?
- Along with the DOM classes comes a wrapper that allows allocating classes with a custom allocator and reference counting them (that is, a RefCounted!T that works only for classes);
- A wonderful (or maybe not) benchmark driver that benchmarks the various components with various kinds of randomly generated files and prints some wonderful statistics and graphs;
- Needed by the benchmarking code, a simple API to collect statistics (average, median, deviation) from a range of measurements;
- Needed by the cursor API, an Interner that can intern not only strings, but any array or class.
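
As an illustration of what the statistics helper computes, here is a small Python sketch. It is not the repository's API: `Summary` and `summarize` are hypothetical names, the timings are made-up sample values, and "deviation" is assumed here to mean the sample standard deviation.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    average: float
    median: float
    deviation: float


def summarize(measures):
    """Collect average, median and sample standard deviation of the measures."""
    values = list(measures)  # accept any iterable, including generators
    if len(values) < 2:
        raise ValueError("need at least two measures to compute a deviation")
    return Summary(
        average=statistics.mean(values),
        median=statistics.median(values),
        deviation=statistics.stdev(values),
    )


# Made-up benchmark timings in milliseconds, for illustration only.
timings_ms = [12.0, 15.0, 11.0, 14.0, 13.0]
print(summarize(timings_ms))
```

Reporting the median next to the average is useful for benchmarks because a single slow run shifts the average much more than the median.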

Thank you again for your time and help.

Lodovico Giaretta
