On Wednesday, February 14, 2018 10:03:45 Patrick Schluter via Digitalmars-d-
announce wrote:
> On Tuesday, 13 February 2018 at 22:00:59 UTC, Jonathan M Davis
> wrote:
> > On Tuesday, February 13, 2018 21:18:12 Patrick Schluter via
> > Digitalmars-d-announce wrote:
> >> [...]
> >
> > Well, if dxml just passes the entity references along unparsed
> > beyond validating that the entity reference itself contains
> > valid characters (e.g. it's not something like &.; or & by
> > itself), then dxml would still not be replacing the entity
> > references with anything. Any security or performance problems
> > associated with entity references would be left up to whatever
> > parser parsed the DTD section and then used dxml to parse the
> > rest of the XML and replaced the entity references in dxml's
> > parsing results with whatever they were.
> >
> > The big problem is how the entity references affect the
> > parsing. If start tags can be dropped in and affect the parsing
> > (and it's still not clear to me from the spec whether that's
> > legal - there is a section talking about being nested properly
> > which might indicate that that's not legal, but it's not very
> > specific or clear), and if it's legal to do something like use
> > an entity reference for a tag name - e.g. <&foo;>, then that's
> > a serious problem. And problems like that are the main reason
> > why I completely dropped any attempt to do anything with the
> > DTD section.
> Yikes! In any case, even if I had to implement a parser I would
> tend to not implement this "feature" as it sounds quite
> unreasonable. Only if a real need (i.e. one in the real world,
> not one that could be contrived out of the specs) arises would I
> then potentially implement the real deal.

Well, since folks other than me are going to use this parser, and it's even
potentially going to end up in D's standard library, it needs to at least be
good enough to not let through invalid XML or incorrectly interpret any XML.
It can potentially not support portions of the spec as long as it does so in
a clear and clean manner, but it's going to have to correctly handle
anything that it does handle.
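
To make the "validate the entity reference but pass it along unparsed" idea
from the quoted text concrete, here is a rough Python sketch. It is an
illustration only, not dxml's code (dxml is written in D), and the function
name check_references is made up for the example. It also assumes an
ASCII-only approximation of the XML Name production, so it rejects some
names that the spec allows.

```python
import re

# Assumption: ASCII-only approximation of the XML Name production.
# The real production also allows many non-ASCII characters.
_ENTITY_REF = re.compile(r"&[A-Za-z_:][A-Za-z0-9_:.\-]*;")
# Decimal or hexadecimal character reference, e.g. &#65; or &#x41;
_CHAR_REF = re.compile(r"&#(?:[0-9]+|x[0-9A-Fa-f]+);")


def check_references(text):
    """Return every reference in text, unexpanded and in order.

    Raises ValueError for a malformed reference such as "&.;" or a
    bare "&". Nothing is replaced; expansion is left to the caller.
    """
    refs = []
    pos = 0
    while True:
        pos = text.find("&", pos)
        if pos == -1:
            return refs
        match = _ENTITY_REF.match(text, pos) or _CHAR_REF.match(text, pos)
        if match is None:
            raise ValueError(f"malformed reference at offset {pos}")
        refs.append(match.group())
        pos = match.end()
```

A caller that had parsed the DTD itself could then look up each returned
reference and substitute its own replacement text, which is where any
security or performance concerns would live.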

For better or worse, I'm the sort of person who prefers to completely
implement a spec when I'm implementing one, but in this case, it wasn't
really reasonable. Fortunately, however, from the perspective of implementing
something that's useful for me personally, the DTD section is completely
unnecessary. From that perspective, processing instructions and CDATA
sections are also unnecessary, since I'd never do anything with them, but I
don't think that it would be reasonable to skip those, so they're
implemented. And it's not like they're hard to implement support for, unlike
the DTD section.

- Jonathan M Davis
