Hi, thanks for attending to this, Andy.

Regards
Philipp

On Tue, 17 Nov 2015 20:02:43 +0000 UTC, Andy Seaborne <[email protected]> wrote:
> It turns out that it is not so simple :-)
>
> Where did this XML come from?
>
> tl;dr
> JENA-1071 : https://issues.apache.org/jira/browse/JENA-1071
>
> Xerces supports XML 1.0 - 4th edition, which does not include codepoint
> U+0370.
>
> Workarounds
> * Turn off the warning.
> * Use rdf:about, not rdf:ID.
>
> The long story:
>
> http://www.w3.org/TR/xml/#NT-NameStartChar is a reference to "XML 1.0 (Fifth
> edition)" and even that is only Unicode 5.0.0. Greek Heta [Ͱ], or U+0370,
> was added in Unicode version 5.1 but is in the codepoint ranges for the 5th
> edition.
>
> "XML 1.0 (Fourth Edition)" does not include U+0370.
>
> Xerces 2.11.0 implements XML 1.0 Fourth Edition (and you are using the
> earlier 2.10.0 - so simply upgrading will not help here, though it is a good
> idea for lots of other reasons).
>
> The XML parser in the Java8 JDK (which happens to be a fork of Xerces from
> way back (2.7.1)) also seems to be 4th edition. IBM Java7 is a fork of 2.9.
>
> Now both Xerces 2.11.0 and Java8 JDK do happen to support a check for XML11
> chars: XML11Char.isXML11ValidNCName. That is not XML 1.1 support.
>
> Andy
>
> [Ͱ] https://en.wikipedia.org/wiki/Heta
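
For reference, the Fifth Edition point in the quoted message can be checked directly against the NameStartChar ranges published in the spec. The following is only a minimal sketch, not Jena or Xerces code: the class and method names are made up for illustration, and the colon is left out of the ranges on the assumption that an rdf:ID value has to be an NCName.

```java
// Sketch: test whether a code point may start an NCName under the
// NameStartChar production of XML 1.0 (Fifth Edition).
// NameStartCharCheck and isNCNameStartChar5e are hypothetical names,
// not part of Jena, Xerces, or the JDK.
final class NameStartCharCheck {

    // Inclusive code point ranges from the Fifth Edition NameStartChar
    // production, with ':' omitted because NCNames cannot contain a colon.
    private static final int[][] RANGES = {
        {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF},
        {0x370, 0x37D}, {0x37F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF}
    };

    // Returns true if codePoint falls inside any of the ranges above.
    static boolean isNCNameStartChar5e(int codePoint) {
        for (int[] range : RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int heta = 0x0370; // GREEK CAPITAL LETTER HETA
        System.out.println("U+0370 may start an NCName (5th ed. ranges): "
                + isNCNameStartChar5e(heta));
    }
}
```

A Fourth Edition parser works from the older Letter/BaseChar tables instead of these ranges, which is why the same character is flagged there.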
