On 06/06/16 18:46, Erich Bremer wrote:
Hi,
I used Jena (3.0) to read an RDF/XML file and then write that RDF back
out to a ttl file. When I try to read that ttl file back into a Jena
Model using RDFDataMgr, the following error is thrown:
Exception in thread "main" org.apache.jena.riot.RiotException: [line:
9873, col: 68] Illegal character in IRI (codepoint 0x5E, '^'):
<http://rdf.wwpdb.org/pdb/11BA/pdbx_struct_assembly_prop/1,ABSA_(A[^]...>
Turtle is defined to work with IRIs and the grammar says:
[18] IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
so it is illegal as an IRI.
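To make the grammar rule concrete, here is a minimal sketch in plain Java (no Jena calls) that looks for the first character excluded by the IRIREF production. The class and method names are made up for illustration, and it deliberately ignores UCHAR escapes, so any backslash is reported as illegal. It is not how RIOT does the check.

```java
// Hypothetical helper, not part of Jena: checks the text between '<' and '>'
// against the character class in the Turtle IRIREF production.
final class IriRefCheck {

    // Characters excluded by [^#x00-#x20<>"{}|^`\], apart from the control range.
    private static final String FORBIDDEN = "<>\"{}|^`\\";

    // Returns the index of the first excluded character, or -1 if none is found.
    // Simplification: UCHAR escapes (\\uXXXX) are not recognised.
    static int firstIllegalIndex(String iri) {
        for (int i = 0; i < iri.length(); i++) {
            char c = iri.charAt(i);
            if (c <= 0x20 || FORBIDDEN.indexOf(c) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String iri = "http://rdf.wwpdb.org/pdb/11BA/pdbx_struct_assembly_prop/1,ABSA_(A^2)";
        int idx = firstIllegalIndex(iri);
        if (idx >= 0) {
            System.out.println("Illegal character '" + iri.charAt(idx) + "' at index " + idx);
        } else {
            System.out.println("No excluded characters found");
        }
    }
}
```

Running a check like this over subject, predicate and object IRIs before writing would flag the '^' in the IRI above, which is the same character the Turtle parser rejects.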
RDF/XML, an older standard written before IRIs were finalized, works
with "RDF URI References". They were designed in anticipation of
where the IRI specs were going, but the IRI drafts changed before
becoming final.
Jena tends to favour compatibility: what could be read remains
readable. The alternative would be to stop accepting, at some version,
certain RDF/XML that is nowadays not considered "good".
Writing is an attempt to get data out on the assumption that it is correct.
There are various ways to get junk data in, including the API, which for
efficiency does not check IRIs.
Andy
at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:165)
at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:108)
at org.apache.jena.riot.lang.LangEngine.expect(LangEngine.java:145)
at org.apache.jena.riot.lang.LangEngine.expectOrEOF(LangEngine.java:130)
at org.apache.jena.riot.lang.LangTurtleBase.expectEndOfTriplesTurtle(LangTurtleBase.java:264)
at org.apache.jena.riot.lang.LangTurtle.expectEndOfTriples(LangTurtle.java:51)
at org.apache.jena.riot.lang.LangTurtleBase.triples(LangTurtleBase.java:250)
at org.apache.jena.riot.lang.LangTurtleBase.triplesSameSubject(LangTurtleBase.java:190)
at org.apache.jena.riot.lang.LangTurtle.oneTopLevelElement(LangTurtle.java:46)
at org.apache.jena.riot.lang.LangTurtleBase.runParser(LangTurtleBase.java:89)
at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:176)
at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:861)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:259)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:233)
at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:223)
at io.haylyn.atoz.TripleCount.main(TripleCount.java:31)
Line 9873 of the ttl file and the few lines that follow are listed here:
<http://rdf.wwpdb.org/pdb/11BA/pdbx_struct_assembly_prop/1,ABSA_(A^2)>
a PDBo:pdbx_struct_assembly_prop ;
PDBo:of_datablock <http://rdf.wwpdb.org/pdb/11BA> ;
PDBo:pdbx_struct_assembly_prop.biol_id
"1" ;
PDBo:pdbx_struct_assembly_prop.type
"ABSA (A^2)" ;
PDBo:pdbx_struct_assembly_prop.value
"6120" .
Shouldn't I be able to read an RDF file that Jena itself wrote? - Erich