[
https://issues.apache.org/jira/browse/JENA-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319237#comment-14319237
]
Stian Soiland-Reyes commented on JENA-878:
------------------------------------------
Jena typically validates the string value first, then does an extraction or
lightweight further parsing from the normalized value in the Xerces
ValidatedInfo. I think this is the main reason for the tight coupling with
xerces.impl.dv here - the ValidatedInfo is kept and passed around for that
purpose.
I checked other libraries to see what they do.
Sesame checks values with regular expressions, which I think might allow some
false positives (e.g. accepting "53" as a month).
https://bitbucket.org/openrdf/sesame/src/1eae6250de5390fd357b2481c74f62a19cca487e/core/model/src/main/java/org/openrdf/model/datatypes/XMLDatatypeUtil.java?at=master
Sesame has its own DateTime parser:
https://bitbucket.org/openrdf/sesame/src/1eae6250de5390fd357b2481c74f62a19cca487e/core/model/src/main/java/org/openrdf/model/datatypes/XMLDateTime.java?at=master
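To illustrate the false-positive concern with regular expressions, here is a
minimal sketch. The pattern is hypothetical and is not the one Sesame uses; it
only checks the shape of the string. The comparison uses java.time.LocalDate,
which assumes Java 8 or later and does not cover the full xsd:date lexical
space (for example timezone suffixes), so it is a stand-in for a real
validating parser rather than a replacement for one.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.regex.Pattern;

class DateLexicalCheck {
    // Hypothetical shape-only pattern: four-digit year, two-digit month,
    // two-digit day. NOT taken from Sesame; it only shows the failure mode.
    static final Pattern SHAPE_ONLY = Pattern.compile("-?\\d{4}-\\d{2}-\\d{2}");

    // True if the string merely looks like a date.
    static boolean matchesShape(String lexical) {
        return SHAPE_ONLY.matcher(lexical).matches();
    }

    // True if the string parses as an ISO-8601 local date, which also
    // checks that month and day are within range.
    static boolean parsesAsDate(String lexical) {
        try {
            LocalDate.parse(lexical);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String suspicious = "2015-53-01";
        System.out.println("shape check accepts: " + matchesShape(suspicious));
        System.out.println("date parser accepts: " + parsesAsDate(suspicious));
    }
}
```

The shape-only check accepts a month of 53 because it never looks at the field
values, while the parser rejects it.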
Clerezza parses the strings using regular java.lang methods like
Boolean.valueOf and sees whether that fails.
https://github.com/apache/clerezza/blob/master/rdf.core/src/main/java/org/apache/clerezza/rdf/core/impl/SimpleLiteralFactory.java
For the date formats (those are the only tricky ones), Clerezza uses a custom
W3CDateFormat, a specialization of java.text.DateFormat.
https://github.com/apache/clerezza/blob/master/rdf.core/src/main/java/org/apache/clerezza/rdf/core/impl/util/W3CDateFormat.java
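The parse-and-see-if-it-fails approach can be sketched as below. This is an
illustration using plain JDK calls, not Clerezza's code; the class and method
names are hypothetical. One caveat worth keeping in mind: Boolean.valueOf(String)
never throws, it returns false for anything that is not "true" (ignoring case),
so a boolean check needs an explicit comparison against the xsd:boolean lexical
forms instead.

```java
class LexicalProbe {
    // Hypothetical helper: true if the JDK integer parser accepts the string.
    static boolean isInt(String lexical) {
        try {
            Integer.parseInt(lexical);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Boolean.valueOf cannot signal failure, so compare against the four
    // xsd:boolean lexical forms explicitly.
    static boolean isBoolean(String lexical) {
        return "true".equals(lexical) || "false".equals(lexical)
                || "1".equals(lexical) || "0".equals(lexical);
    }

    public static void main(String[] args) {
        System.out.println("isInt(\"42\"): " + isInt("42"));
        System.out.println("isInt(\"4x2\"): " + isInt("4x2"));
        // No exception here, just a silent false:
        System.out.println("Boolean.valueOf(\"maybe\"): " + Boolean.valueOf("maybe"));
        System.out.println("isBoolean(\"maybe\"): " + isBoolean("maybe"));
    }
}
```

The JDK parsers and the XSD lexical spaces do not line up exactly, so this
style of check is only as strict as the JDK method behind it.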
> Avoid dependencies on xerces.impl
> ---------------------------------
>
> Key: JENA-878
> URL: https://issues.apache.org/jira/browse/JENA-878
> Project: Apache Jena
> Issue Type: Task
> Components: Jena
> Affects Versions: Jena 2.13.0
> Reporter: Stian Soiland-Reyes
> Priority: Minor
>
> Building jena-osgi complains about xerces.impl dependencies:
> > [WARNING] Bundle org.apache.jena:jena-osgi:bundle:2.12.2-SNAPSHOT : Unused
> > Private-Package instructions, no such package(s) on the class path: [!*]
> > [WARNING] Bundle org.apache.jena:jena-osgi:bundle:2.12.2-SNAPSHOT : Export
> > com.hp.hpl.jena.datatypes.xsd, has 1, private references
> > [org.apache.xerces.impl.dv],
> {code}
> stain@biggie-utopic:~/src/jena/jena-core/src/main/java/com/hp/hpl/jena/datatypes$ grep -r xerces.*impl .
> ./xsd/XSDDatatype.java:import org.apache.xerces.impl.dv.* ;
> ./xsd/XSDDatatype.java:import org.apache.xerces.impl.dv.util.Base64 ;
> ./xsd/XSDDatatype.java:import org.apache.xerces.impl.dv.util.HexBin ;
> ./xsd/XSDDatatype.java:import org.apache.xerces.impl.dv.xs.DecimalDV ;
> ./xsd/XSDDatatype.java:import org.apache.xerces.impl.dv.xs.XSSimpleTypeDecl ;
> ./xsd/XSDDatatype.java:import org.apache.xerces.impl.validation.ValidationState ;
> ./xsd/XSDhexBinary.java:import org.apache.xerces.impl.dv.util.HexBin ;
> ./xsd/XSDbase64Binary.java:import org.apache.xerces.impl.dv.util.Base64 ;
> ./xsd/impl/XSDGenericType.java:import org.apache.xerces.impl.dv.XSSimpleType;
> {code}
> It is not good style to depend on the *.impl packages of a library - such a
> dependency is liable to break at some point. jena-osgi complains, but works in
> this particular case, because xercesImpl is shaded in.
> Some/all of these (base64) are available through more official packages -
> org.apache.commons.codec.binary.Base64 comes to mind.
> https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/Base64.html
> https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/Hex.html
> So this task suggests replacing these dependencies with their commons-codec
> equivalents. Remember to add commons-codec to jena-osgi as well!