Re: Spurious warning when using exotic XML names?

Andy Seaborne Tue, 17 Nov 2015 12:03:18 -0800

It turns out that it is not so simple :-)

Where did this XML come from?


tl;dr
JENA-1071 : https://issues.apache.org/jira/browse/JENA-1071

Xerces supports XML 1.0, 4th edition, which does not include codepoint U+0370.


Workarounds
* Turn off the warning.
* Use rdf:about, not rdf:ID.

The long story:

http://www.w3.org/TR/xml/#NT-NameStartChar is a reference to "XML 1.0 (Fifth Edition)" and even that is only Unicode 5.0.0. Greek Heta [Ͱ], or U+0370, was added in Unicode version 5.1 but is in the codepoint ranges for the 5th edition.


"XML 1.0 (Fourth Edition)" does not include U+0370.
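
To make the Fifth Edition side of this concrete, here is a minimal sketch that checks a codepoint against the NameStartChar ranges listed at the W3C link above. The class and method names are made up for illustration; this is not Jena or Xerces code, and it says nothing about what those parsers actually do.

```java
// Illustrative sketch only: NameStartChar5e and isNameStartChar are
// hypothetical names, not part of Jena, Xerces, or the JDK.
final class NameStartChar5e {

    // NameStartChar ranges from "XML 1.0 (Fifth Edition)", as inclusive pairs.
    private static final int[][] RANGES = {
        {':', ':'}, {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF},
        {0x370, 0x37D}, {0x37F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF}
    };

    // True if the codepoint may start a name under the Fifth Edition ranges.
    static boolean isNameStartChar(int codePoint) {
        for (int[] range : RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // U+0370 (Greek Heta) falls in the 0x370-0x37D range.
        System.out.println("U+0370: " + isNameStartChar(0x0370));
        // U+037E sits in the gap between 0x37D and 0x37F.
        System.out.println("U+037E: " + isNameStartChar(0x037E));
    }
}
```

The Fifth Edition defines names by broad codepoint ranges rather than by per-character tables, which is why a character added in a later Unicode version can still be allowed.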

Xerces 2.11.0 implements XML 1.0 Fourth Edition (and you are using the earlier 2.10.0, so simply upgrading will not help here, though it is a good idea for lots of other reasons).

The XML parser in the Java8 JDK (which happens to be a fork of Xerces from way back, 2.7.1) also seems to be 4th edition. IBM Java7 is a fork of 2.9.

Now both Xerces 2.11.0 and the Java8 JDK do happen to support a check for XML 1.1 name characters, XML11Char.isXML11ValidNCName. That is not XML 1.1 support.


        Andy

[Ͱ]
https://en.wikipedia.org/wiki/Heta
