I hadn't realised how intrusive the use of Xerces was. It affects all XML parsing in the app, not just RDF/XML in Jena, because Xerces wires itself into the app. It also replaces the XML datatypes factory (its factory is a subclass of the standard one that supplies different Duration and XMLGregorianCalendar implementations).
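
A quick way to see this wiring from inside an application is to ask JAXP which factory implementations it has selected. This is only a minimal sketch using the standard `javax.xml` lookup calls; the class name `WhichXmlImpl` is made up for illustration.

```java
import javax.xml.datatype.DatatypeFactory;
import javax.xml.parsers.SAXParserFactory;

public class WhichXmlImpl {
    /** Class name of the SAX parser factory that JAXP selects at runtime. */
    static String saxFactoryName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    /** Class name of the datatype factory that JAXP selects at runtime. */
    static String datatypeFactoryName() throws Exception {
        return DatatypeFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // The printed names depend on what is on the classpath.
        System.out.println("SAXParserFactory: " + saxFactoryName());
        System.out.println("DatatypeFactory:  " + datatypeFactoryName());
    }
}
```

With only the JDK present, I would expect names under `com.sun.org.apache.xerces.internal`; with the Xerces jar on the classpath they would typically be under `org.apache.xerces`, for every XML user in the app.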

The Duration bug is not fixed in Xerces 2.12.0. The new release is mostly around XML Schema 1.1.

I've now updated the Jena top level NOTICE taking the relevant text from Xerces's NOTICE. The jena-core NOTICE is updated as well; we could/should roll up all the sub NOTICE/LICENSE and just have the top level one for source and the ones for binaries (download, Fuseki's).

It is now ready for integration.

    Andy

On 02/05/18 07:01, Claude Warren wrote:
I think undepending on Xerces is a good idea as well.  With lots of other
faster parsers to choose from it seems like we should not be forcing apps
to include Xerces as well.

Claude

On Tue, May 1, 2018 at 12:05 PM, Andy Seaborne <[email protected]> wrote:

FYI:

Xerces 2.12.0 is out (as of April 21) though it has not made it to Maven
central.

One thing of interest (to me) is whether it has a bug-fixed version of
Duration (JENA-1402).

I still think we should un-depend on Xerces.

     Andy


On 28/04/18 20:38, Andy Seaborne wrote:

JENA-1537

While the JDK does have a Xerces derived parser (it split off long before
2.11.0 and separately evolved), it is behind Java9 module "java.xml".

Jena uses Xerces 2.11.0 in two ways - for the datatypes (oaj.datatypes)
and XML parsing (oaj.rdfxml.xmlinput - also known as ARP).  Both make
internal use of Xerces.

The datatypes code uses Xerces to provide XSD datatypes, including validation.

RDFXMLParser uses the Xerces SAXParser and, in a minor way, some other stuff
that isn't in the standard SAX API (org.xml.sax).
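
For comparison, here is a minimal sketch of SAX parsing through the standard JAXP entry points only, with no Xerces-specific classes. It is not the ARP code; the class `JaxpSaxSketch` and the method `countElements` are hypothetical names used for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class JaxpSaxSketch {
    /** Counts start-element events using whichever SAX parser JAXP provides. */
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // RDF/XML relies on namespaces
        SAXParser parser = factory.newSAXParser();

        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        // Pass bytes, not a Reader, so the parser handles the encoding itself.
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                     handler);
        return count[0];
    }
}
```

Anything beyond this API surface (the "other stuff") is where a replacement would need extra work.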

I've had a prototype-hack go at removing Xerces from Jena:
https://github.com/afs/jena-xerces

Datatypes:

* One feature omitted: XSDDatatype.loadUserDefined.

These functions parse XSD schema datatype definitions. The implementation
calls into the internal XML parsing which would not be legal in Java9
modules if using the JDK built-in parser. It seems to need a fairly
complete XML parser engine.

We should consider dropping this feature.

XML Parsing:

* Loses the check on whether InputStreamReader or FileReader have the
right encoding for the XML document. It hooks into an interface call that
does not seem to be available in a standard SAX parser. (Shouldn't be using
Readers anyway!)
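
To illustrate why Readers are a problem here, the sketch below parses the same bytes twice with the standard SAX API: once as an InputStream, where the parser can honour the encoding in the XML declaration, and once through an InputStreamReader opened with a mismatched charset. The class and method names are hypothetical, and the behaviour described is what I would expect from the JDK built-in parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ReaderEncodingSketch {
    // A document that declares ISO-8859-1 and is really encoded that way.
    static final byte[] DOC =
        "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>\u00e9</a>"
            .getBytes(StandardCharsets.ISO_8859_1);

    /** Returns the character content of the document read from source. */
    static String text(InputSource source) throws Exception {
        StringBuilder sb = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                sb.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return sb.toString();
    }

    /** Byte stream: the parser decodes using the declared encoding. */
    static String viaStream() throws Exception {
        return text(new InputSource(new ByteArrayInputStream(DOC)));
    }

    /** Reader with the wrong charset: decoding happens before the parser sees it. */
    static String viaMismatchedReader() throws Exception {
        Reader r = new InputStreamReader(new ByteArrayInputStream(DOC),
                                         StandardCharsets.UTF_8);
        return text(new InputSource(r));
    }
}
```

The stream route recovers the original character; the mismatched Reader route silently yields different text, which is the kind of error the lost check was guarding against.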

      Andy



