On Friday, 9 February 2018 at 21:15:33 UTC, Jonathan M Davis
wrote:
I have multiple projects that need an XML parser, and
std_experimental_xml is clearly going nowhere, with the guy who
wrote it having disappeared into the ether, so I decided to
break down and write one. I've kind of wanted to for years, but
I didn't want to spend the time on it. However, sometime last
year I finally decided that I had to, and it's been what I've
been working on in my free time for a while now. And it's
finally reached the point when it makes sense to release it -
hence this post.
Currently, dxml contains only a range-based StAX / pull parser
and related helper functions, but the plan is to add a DOM
parser as well as two writers - one which is the writer
equivalent of a StAX parser, and one which is DOM-based.
However, in theory, the StAX parser is complete and quite
useable as-is, though I expect that I'll be adding more helper
functions to make it easier to use. If you find that you're
doing a particular operation with it frequently and that the
operation is overly verbose, please point it out so that maybe
a helper function can be added to improve that use case. For
example, I'm thinking of adding a function similar to
std.getopt.getopt for handling attributes, because I personally
find that dealing with those is more verbose than I'd like.
Obviously, some stuff
is just going to do better with a DOM parser, but thus far,
I've found that a StAX parser has suited my needs quite well. I
have no plans to add a SAX parser, since as far as I can tell,
SAX parsers are just plain worse than StAX parsers, and the
StAX approach is quite well-suited to ranges.
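For readers unfamiliar with the StAX / pull style, here is a rough sketch of the idea. It does not use dxml or D: it uses Python's standard-library xml.dom.pulldom purely to show the shape of the approach, where the caller pulls one event at a time from the parser instead of registering callbacks (SAX) or building a whole tree first (DOM). The sample document and the collect_items helper are made up for this illustration.

```python
from xml.dom import pulldom

# Hypothetical sample document, invented for this illustration.
XML = '<root><item id="1">foo</item><item id="2">bar</item></root>'

def collect_items(xml_text):
    """Pull events one at a time and gather (id, text) for each <item>."""
    items = []
    current_id = None   # id of the <item> being read, or None if outside one
    text_parts = []
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            current_id = node.getAttribute("id")
            text_parts = []
        elif event == pulldom.CHARACTERS and current_id is not None:
            text_parts.append(node.data)
        elif event == pulldom.END_ELEMENT and node.tagName == "item":
            items.append((current_id, "".join(text_parts)))
            current_id = None
    return items

print(collect_items(XML))
```

The consuming loop owns the control flow, which is why this style maps naturally onto an iterator or range: the parser is just a sequence of entities that the caller advances through.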
Of note, dxml does not support the DTD section beyond what is
required to parse past it, since supporting it would make it
impossible for the parser to return slices of the original
input beyond the case where strings are used (and it would be
forced to allocate strings in some cases, whereas dxml does
_very_ minimal heap allocation right now), and parsing the DTD
section significantly increases the complexity of the parser in
order to support something that I honestly don't think should
ever have been part of the XML standard and is unnecessary for
many, many XML documents. So, if you're dealing with XML
documents that contain entity references that are declared in
the DTD section and then used outside of the DTD section, then
dxml will not support them, but it will work just fine if a DTD
section is there so long as it doesn't declare any entity
references that are then referenced in the document proper.
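To make the distinction concrete, the sketch below shows the kind of document the paragraph above is talking about: an entity declared in the DTD section and then referenced in the document body. It uses Python's standard-library xml.etree.ElementTree only to demonstrate what such a reference means in XML. It says nothing about how dxml reports these cases, and the sample documents and the body_text helper are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Entity declared in the DTD's internal subset, then used in the body.
WITH_DTD_ENTITY = (
    '<!DOCTYPE root [<!ENTITY greeting "hello">]>'
    "<root>&greeting;</root>"
)

# The same reference with no declaration anywhere in the document.
WITHOUT_DECLARATION = "<root>&greeting;</root>"

# &amp; is one of the entities predefined by XML itself; no DTD is involved.
PREDEFINED_ONLY = "<root>a &amp; b</root>"

def body_text(xml_text):
    """Return the root element's text, or None if the document fails to parse."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None

print(body_text(WITH_DTD_ENTITY))      # replacement text comes from the DTD
print(body_text(WITHOUT_DECLARATION))  # undefined entity, so parsing fails
print(body_text(PREDEFINED_ONLY))
```

Resolving the first case requires the parser to read the DTD and substitute text that is not a contiguous piece of the original input, which is the slicing and allocation cost the post describes.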
Hopefully, the documentation is clear enough, but obviously,
I'm not the best judge of that. So, have at it.
Documentation: http://jmdavisprog.com/docs/dxml/0.1.0/
Github: https://github.com/jmdavis/dxml
Dub: http://code.dlang.org/packages/dxml
- Jonathan M Davis
This is going to be really useful for people like me who work
with web services using SOAP.
Thanks for the great work.