On Friday, 9 February 2018 at 21:15:33 UTC, Jonathan M Davis wrote:
I have multiple projects that need an XML parser, and std_experimental_xml is clearly going nowhere, with the guy who wrote it having disappeared into the ether, so I decided to break down and write one. I've kind of wanted to for years, but I didn't want to spend the time on it. However, sometime last year I finally decided that I had to, and it's been what I've been working on in my free time for a while now. And it's finally reached the point when it makes sense to release it - hence this post.

Currently, dxml contains only a range-based StAX / pull parser and related helper functions, but the plan is to add a DOM parser as well as two writers - one which is the writer equivalent of a StAX parser, and one which is DOM-based. However, in theory, the StAX parser is complete and quite usable as-is, though I expect that I'll be adding more helper functions to make it easier to use. If you find that you're doing a particular operation with it frequently and that the operation is overly verbose, please point it out so that maybe a helper function can be added to improve that use case - e.g. I'm thinking of adding a function similar to std.getopt.getopt for handling attributes, because I personally find that dealing with those is more verbose than I'd like. Obviously, some stuff is just going to do better with a DOM parser, but thus far, I've found that a StAX parser has suited my needs quite well. I have no plans to add a SAX parser, since as far as I can tell, SAX parsers are just plain worse than StAX parsers, and the StAX approach is quite well-suited to ranges.
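
As a rough illustration of the pull style described above (the caller asks for the next entity instead of registering callbacks, as it would with SAX), here is a minimal sketch in Python using the standard library's xml.dom.pulldom. It is not dxml's API; the function name list_entities and the sample document are made up for the example.

```python
from xml.dom import pulldom


def list_entities(xml_text):
    """Pull parse events one at a time and collect them as tuples.

    Hypothetical helper for illustration only; it is unrelated to dxml.
    """
    entities = []
    # The loop drives the parser: each iteration pulls the next event.
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            # NamedNodeMap.items() yields (attribute name, value) pairs.
            attrs = dict(node.attributes.items())
            entities.append(("start", node.tagName, attrs))
        elif event == pulldom.END_ELEMENT:
            entities.append(("end", node.tagName))
        elif event == pulldom.CHARACTERS and node.data.strip():
            # Whitespace-only text is skipped to keep the output short.
            entities.append(("text", node.data))
    return entities


if __name__ == "__main__":
    sample = '<root><item id="1">hello</item><empty/></root>'
    for entity in list_entities(sample):
        print(entity)
```

Because the caller owns the loop, it can stop early, skip ahead, or hand the remaining events to another function, which is awkward to do with callback-based SAX handlers.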

Of note, dxml does not support the DTD section beyond what is required to parse past it. Supporting it would make it impossible for the parser to return slices of the original input beyond the case where strings are used (and it would be forced to allocate strings in some cases, whereas dxml does _very_ minimal heap allocation right now), and parsing the DTD section significantly increases the complexity of the parser in order to support something that I honestly don't think should ever have been part of the XML standard and is unnecessary for many, many XML documents. So, if you're dealing with XML documents that contain entity references that are declared in the DTD section and then used outside of the DTD section, then dxml will not support them, but it will work just fine if a DTD section is there so long as it doesn't declare any entity references that are then referenced in the document proper.
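
To make the distinction concrete, the sketch below shows the two kinds of document. It uses Python's xml.etree.ElementTree purely because its underlying parser does expand entities declared in an internal DTD subset, which is the feature dxml leaves out. The document strings and the names WITH_ENTITY, WITHOUT_ENTITY, and body_text are invented for the example.

```python
import xml.etree.ElementTree as ET

# The DTD declares an entity that the document body then references.
# Per the post, dxml does not support this kind of document.
WITH_ENTITY = '<!DOCTYPE note [<!ENTITY author "Jane">]><note>&author;</note>'

# The DTD declares no entities that the body uses.
# Per the post, a parser only needs to skip past the DTD here.
WITHOUT_ENTITY = '<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>Jane</note>'


def body_text(doc):
    """Return the text of the root element (hypothetical helper)."""
    return ET.fromstring(doc).text


if __name__ == "__main__":
    print(body_text(WITH_ENTITY))
    print(body_text(WITHOUT_ENTITY))
```

Both calls yield the same text, but only the first one requires the parser to build a string that appears nowhere as a contiguous slice of the input, which is the allocation cost the paragraph above describes.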

Hopefully, the documentation is clear enough, but obviously, I'm not the best judge of that. So, have at it.

Documentation: http://jmdavisprog.com/docs/dxml/0.1.0/
Github: https://github.com/jmdavis/dxml
Dub: http://code.dlang.org/packages/dxml

- Jonathan M Davis

This is going to be really useful for people like me who work with web services using SOAP.

Thanks for the great work.
