Hi Patrik,
On 18/08/18 21:01, [email protected] wrote:
Hello,
I have an issue with reading(*1) the OWL input file using the following read()
method from the Model interface.
- https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/rdf/model/Model.html#read-java.lang.String-java.lang.String-
When an input OWL file has an ".owl" extension but is in a syntax other than
RDF/XML (Turtle, for example), an exception(*2) is thrown, even though I use the
second parameter of the method to define the syntax of the input file, as
recommended in this tutorial:
- http://jena.apache.org/documentation/io/rdf-input.html#example-1-common-usage
This text applies as well:
"If the syntax is not as the file extension, a language can be declared:"
File extension ".owl" has a bit of a chequered history.
".owl" has not been formally registered. The closest is the OWL1 reference, which says
"As file extension, we recommend to use either .rdf or .owl.", but that
is not in the IANA registration.
https://www.w3.org/TR/owl-ref/#MIMEType
In practice, ".owl" is also used for any of the OWL syntaxes.
Jena reads only RDF syntaxes (it does not read the OWL2 non-RDF syntaxes), so
".owl" is registered as RDF/XML, following the OWL1 suggestion from when that
was the only choice.
When I change the extension to ".ttl", everything runs fine, the OWL file is
valid. It's not a huge problem for me, right now I am using the InputStream when reading
the input OWL file. I was just curious - is that how it should work? For me, it does not
look like it works correctly, but maybe I have missed something... (I am using Jena
version 3.8.0)
(*1) example source code using the read() method:
OntModel model = null;
try {
model = ModelFactory.createOntologyModel();
model.read("mrettl.owl", "TURTLE");
The specified language serves as a "hint" rather than forcing the syntax.
Both hint and force have use cases, but this API only has room for one,
and it is the hint.
The hint is used if the system can't determine the syntax another way.
Here, ".owl" is registered as an RDF/XML syntax -- file extensions are
used as a sort of "Content-type" and if an RDF syntax "Content-type" is
given, it is assumed to be correct.
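To make the hint-versus-force distinction concrete, here is a minimal sketch in plain Java. It is a simplified model of the resolution order described above, not Jena's actual code: the class SyntaxResolution, its methods and the extension table are all hypothetical, and only the ".owl" to RDF/XML mapping is taken from the explanation above.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Simplified model of hint-versus-force syntax selection; NOT Jena's real implementation. */
final class SyntaxResolution {
    // Hypothetical extension registry. ".owl" maps to RDF/XML, as described above.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "rdf", "RDF/XML",
            "owl", "RDF/XML",
            "ttl", "TURTLE",
            "nt", "N-TRIPLES");

    /** Returns the syntax registered for the file's extension, if there is one. */
    static Optional<String> fromExtension(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0 || dot == filename.length() - 1) {
            return Optional.empty();
        }
        String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(BY_EXTENSION.get(ext));
    }

    /** Hint semantics: a registered extension wins; the hint is only a fallback. */
    static String resolveWithHint(String filename, String hintLang) {
        return fromExtension(filename).orElse(hintLang);
    }

    /** Force semantics: the declared language always wins, whatever the extension says. */
    static String resolveWithForce(String filename, String forcedLang) {
        return forcedLang;
    }
}
```

In this model, resolveWithHint("mrettl.owl", "TURTLE") selects RDF/XML because the extension is registered, while resolveWithForce("mrettl.owl", "TURTLE") selects Turtle.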
You can lose the file extension information, as you have discovered, with
InputStream in = IO.openFile("mrettl.owl");
m.read(in, null, "TURTLE");
and for plain model reading, to access the "force lang":
RDFParser.create()
.source("mrettl.owl")
.base(baseUri)
.forceLang(lang)   // <-- forces the syntax
.parse(destination);
} catch (Exception e) {
e.printStackTrace();
}
(*2) thrown Exception:
Decoded: the RDF/XML parser was run on the file, which is not XML.
org.apache.jena.riot.RiotException: [line: 1, col: 1 ] Content is not allowed in prolog.
    at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:147)
    at org.apache.jena.riot.lang.ReaderRIOTRDFXML$ErrorHandlerBridge.fatalError(ReaderRIOTRDFXML.java:313)
    at org.apache.jena.rdfxml.xmlinput.impl.ARPSaxErrorHandler.fatalError(ARPSaxErrorHandler.java:47)
    at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.warning(XMLHandler.java:199)
    at org.apache.jena.rdfxml.xmlinput.impl.XMLHandler.fatalError(XMLHandler.java:229)
    at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.jena.rdfxml.xmlinput.impl.RDFXMLParser.parse(RDFXMLParser.java:171)
    at org.apache.jena.rdfxml.xmlinput.ARP.load(ARP.java:118)
    at org.apache.jena.riot.lang.ReaderRIOTRDFXML.parse(ReaderRIOTRDFXML.java:188)
    at org.apache.jena.riot.lang.ReaderRIOTRDFXML.read(ReaderRIOTRDFXML.java:86)
    at org.apache.jena.riot.RDFParser.read(RDFParser.java:352)
    at org.apache.jena.riot.RDFParser.parseURI(RDFParser.java:321)
    at org.apache.jena.riot.RDFParser.parse(RDFParser.java:295)
    at org.apache.jena.riot.RDFParserBuilder.parse(RDFParserBuilder.java:500)
    at org.apache.jena.riot.RDFDataMgr.parseFromURI(RDFDataMgr.java:890)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:221)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:190)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:120)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:111)
    at org.apache.jena.riot.adapters.RDFReaderRIOT.read(RDFReaderRIOT.java:76)
    at org.apache.jena.rdf.model.impl.ModelCom.read(ModelCom.java:281)
    at org.apache.jena.ontology.impl.OntModelImpl.readDelegate(OntModelImpl.java:3091)
    at org.apache.jena.ontology.impl.OntModelImpl.read(OntModelImpl.java:2185)
    at org.apache.jena.ontology.impl.OntModelImpl.read(OntModelImpl.java:2148)
    at testing.ReadTest.main(ReadTest.java:15)
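The Xerces frames in the trace show that the message comes from the XML parser underneath the RDF/XML reader: it is what an XML parser reports when the document does not begin with XML. A minimal sketch that reproduces this outside Jena, using only the JDK's SAX parser, is below. The class name PrologErrorDemo and the one-line Turtle sample are made up for illustration, and the exact wording of the message may depend on the JDK and locale.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Illustration only: shows what an XML parser does when it is given Turtle text. */
final class PrologErrorDemo {

    /** Parses the text as XML; returns the parse error, or null if the text is well-formed XML. */
    static SAXParseException parseAsXml(String text) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // The parser rejected the input; hand the error back to the caller.
            return e;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected in this sketch.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A Turtle document starts with "@prefix", not with "<", so it cannot be XML.
        String turtle = "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n";
        SAXParseException e = parseAsXml(turtle);
        System.out.println("line " + e.getLineNumber() + ": " + e.getMessage());
    }
}
```

The parser gives up on the first line, before any RDF is seen, which matches the position reported in the RiotException above.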
Thanks,
Patrik