[
https://issues.apache.org/jira/browse/JENA-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712840#comment-17712840
]
Jan Martin Keil commented on JENA-2351:
---------------------------------------
I agree with both points: Jena should never output illegal IRIs, and being
liberal on input is good. However, Jena currently produces illegal output
because of that liberality on input. I see the following options:
# Expect library users to filter out IRIs containing line breaks before
serialization - this makes Jena very hard to use and is not what one expects
from a library.
# -Refuse SPARQL answers with illegal IRIs - against Postel's Law-
# Skip statements with illegal IRIs during the SPARQL query - this might cause
strange behavior in the software, especially if central statements get skipped.
# Replace the IRI with a blank node during the query - the invalid IRI gets lost.
Connections in the source might also get lost if several queries are affected
and the blank node replacements are not consistent between queries. Library
users have no access to the original IRI, e.g. to fix such problems.
# Skip statements with illegal IRIs during serialization - this might cause
strange behavior in downstream software.
# Adapt the illegal IRI (e.g. by percent-encoding) during serialization - the
library invents new IRIs that are not really useful, as they do not link to the
original resources.
# -Escape the IRI during serialization - not standards-conformant-
# Replace the IRI with a blank node during serialization - the invalid IRI gets
lost (which is not much of a loss), but no statements / links in the model get
lost.
I think option 8 is the best of these. Any further ideas?
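To make option 8 more concrete, here is a minimal sketch in plain Java. It
deliberately does not use the Jena API: the class BlankNodeFallback, its
methods, and the "_:illegal" label prefix are hypothetical names chosen for
illustration, and the check only covers the line-break characters discussed in
this issue. The point is that the same illegal IRI always maps to the same
blank node label within one serialization, so links between statements
survive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BlankNodeFallback {

    // Assumption: only CR and LF are treated as illegal here;
    // a real check would have to cover the full IRI grammar.
    static boolean isWritableIri(String iri) {
        return iri.indexOf('\n') < 0 && iri.indexOf('\r') < 0;
    }

    // Renders an IRI as an N-Triples term, or as a blank node label if it is illegal.
    static String term(String iri, Map<String, String> replacements) {
        if (isWritableIri(iri)) {
            return "<" + iri + ">";
        }
        String label = replacements.get(iri);
        if (label == null) {
            // Same illegal IRI -> same label, so statements stay connected.
            label = "_:illegal" + replacements.size();
            replacements.put(iri, label);
        }
        return label;
    }

    // Each triple is {subject IRI, predicate IRI, object IRI}; literals are omitted for brevity.
    static List<String> toNTriples(List<String[]> triples) {
        Map<String, String> replacements = new HashMap<>();
        List<String> lines = new ArrayList<>();
        for (String[] t : triples) {
            if (!isWritableIri(t[1])) {
                // Blank nodes are not allowed in the predicate position.
                throw new IllegalArgumentException("illegal predicate IRI cannot be replaced");
            }
            lines.add(term(t[0], replacements) + " <" + t[1] + "> "
                    + term(t[2], replacements) + " .");
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String[]> triples = List.of(
                new String[] {"http://example.org/aaa/\nbbb",
                        "http://example.org/property", "http://example.org/ccc"},
                new String[] {"http://example.org/ddd",
                        "http://example.org/property", "http://example.org/aaa/\nbbb"});
        toNTriples(triples).forEach(System.out::println);
    }
}
```

Both statements in the main method end up referring to the same blank node, so
the link between them is kept even though the original IRI is not written.
Predicates are the open point of this approach: a blank node cannot stand in
for them, so the sketch simply rejects that case.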
Regarding the warning: I also get a warning during the query. Initially it was
hidden from me because of a bad logging configuration. Full example:
{code:java}
import org.apache.jena.query.QueryExecution;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.junit.jupiter.api.Test;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class Example {
    @Test
    public void test() throws IOException {
        Model model = ModelFactory.createDefaultModel();
        // Fetch the affected statements from DBpedia into the model.
        QueryExecution.service("https://dbpedia.org/sparql/")
                .query("CONSTRUCT {?s ?p ?o} WHERE {"
                        + " BIND(<http://dbpedia.org/resource/Neutron_Star_Interior_Composition_Explorer> as ?s)"
                        + " BIND(<http://xmlns.com/foaf/0.1/depiction> as ?p)"
                        + " ?s ?p ?o }")
                .build().execConstruct(model);
        model.write(System.out, "NT");
        // Round trip: write the model to a file and read it back.
        File file = File.createTempFile("test", ".nt");
        try (FileOutputStream out = new FileOutputStream(file)) {
            model.write(out, "NT");
        }
        try (FileInputStream in = new FileInputStream(file)) {
            ModelFactory.createDefaultModel()
                    .read(in, "", "NT")
                    .write(System.out, "NT");
        }
    }
}
{code}
> Newline (U+000A) in IRIs not escaped during NT/TTL/NQ/TRIG serialization
> -------------------------------------------------------------------------
>
> Key: JENA-2351
> URL: https://issues.apache.org/jira/browse/JENA-2351
> Project: Apache Jena
> Issue Type: Bug
> Components: RIOT
> Affects Versions: Jena 4.7.0
> Reporter: Jan Martin Keil
> Priority: Major
>
> [Newline characters (U+000A) in
> IRIs|https://github.com/dbpedia/extraction-framework/issues/748] are not
> escaped during the serialization of a model or dataset into a format of the
> Turtle family. This results in invalid files that Jena can no longer read.
> Please note the following tests:
> {code:java}
> import org.apache.jena.query.Dataset;
> import org.apache.jena.query.DatasetFactory;
> import org.apache.jena.rdf.model.*;
> import org.apache.jena.riot.Lang;
> import org.apache.jena.riot.RDFDataMgr;
> import org.junit.jupiter.api.Test;
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileOutputStream;
> import java.io.IOException;
> public class Example {
> @Test
> public void rdfXml() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model = ModelFactory.createDefaultModel();
> model.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> System.out.println("\nRDF/XML:\n");
> model.write(System.out,"RDF/XML");
> // test write and read
> File file = File.createTempFile("example",".rdf");
> model.write(new FileOutputStream(file),"RDF/XML");
> ModelFactory.createDefaultModel().read(new
> FileInputStream(file),"","RDF/XML");
> }
> @Test
> public void ttl() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model = ModelFactory.createDefaultModel();
> model.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> System.out.println("\nTTL:\n");
> model.write(System.out,"TTL");
> // test write and read
> File file = File.createTempFile("example",".ttl");
> model.write(new FileOutputStream(file),"TTL");
> ModelFactory.createDefaultModel().read(new
> FileInputStream(file),"","TTL");
> }
> @Test
> public void nTriples() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model = ModelFactory.createDefaultModel();
> model.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> System.out.println("\nN-TRIPLE:\n");
> model.write(System.out,"N-TRIPLE");
> // test write and read
> File file = File.createTempFile("example",".nt");
> model.write(new FileOutputStream(file),"N-TRIPLE");
> ModelFactory.createDefaultModel().read(new
> FileInputStream(file),"","N-TRIPLE");
> }
> @Test
> public void nq() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model1 = ModelFactory.createDefaultModel();
> model1.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> Model model2 = ModelFactory.createDefaultModel();
> model2.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> Dataset dataset = DatasetFactory.createGeneral();
> dataset.setDefaultModel(model1);
> dataset.addNamedModel("http://example.org/namedGraph",model2);
> System.out.println("\nNQ:\n");
> RDFDataMgr.write(System.out, dataset, Lang.NQ) ;
> // test write and read
> File file = File.createTempFile("example", ".nq");
> RDFDataMgr.write(new FileOutputStream(file), dataset, Lang.NQ) ;
> RDFDataMgr.read(DatasetFactory.createGeneral(), new
> FileInputStream(file), Lang.NQ) ;
> }
> @Test
> public void trig() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model1 = ModelFactory.createDefaultModel();
> model1.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> Model model2 = ModelFactory.createDefaultModel();
> model2.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> Dataset dataset = DatasetFactory.createGeneral();
> dataset.setDefaultModel(model1);
> dataset.addNamedModel("http://example.org/namedGraph",model2);
> System.out.println("\nTRIG:\n");
> RDFDataMgr.write(System.out, dataset, Lang.TRIG) ;
> // test write and read
> File file = File.createTempFile("example", ".trig");
> RDFDataMgr.write(new FileOutputStream(file), dataset, Lang.TRIG) ;
> RDFDataMgr.read(DatasetFactory.createGeneral(), new
> FileInputStream(file), Lang.TRIG) ;
> }
> }
> {code}
> Outputs (stack traces truncated):
> {code:java}
> N-TRIPLE:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> Apr. 15, 2023 10:01:45 PM
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SEVERE: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> at
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> ...
> {code}
> {code:java}
> RDF/XML:
> <rdf:RDF
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:j.0="http://example.org/">
> <rdf:Description rdf:about="http://example.org/aaa/
> bbb">
> <j.0:property>a string</j.0:property>
> </rdf:Description>
> </rdf:RDF>
> {code}
> {code:java}
> NQ:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" <http://example.org/namedGraph> .
> Apr. 15, 2023 10:01:45 PM
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SEVERE: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> at
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> ...
> {code}
> {code:java}
> TTL:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> Apr. 15, 2023 10:01:45 PM
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SEVERE: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> at
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> ...
> {code}
> {code:java}
> TRIG:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> <http://example.org/namedGraph> {
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> }
> Apr. 15, 2023 10:01:45 PM
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SEVERE: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> at
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> ...
> {code}