[
https://issues.apache.org/jira/browse/JENA-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712927#comment-17712927
]
Andy Seaborne commented on JENA-2351:
-------------------------------------
Jena cannot be expected to fix the problems of other systems. There isn't one
solution; a fix for one app's usage is not necessarily what another app might
want.
e.g. when parsing: translate the character vs. let bad data through.
Anything that changes the data should be taking place in a transformation step.
Parse to a {{StreamRDF}} and have the {{StreamRDF}} decide the fixup, such as
replacement.
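A minimal sketch of that shape, using only the JDK so that it stands alone.
{{TripleSink}}, {{FixupSink}} and {{CollectingSink}} are hypothetical names for
illustration; {{TripleSink}} merely stands in for a streaming destination such
as {{StreamRDF}} and is not Jena's interface. The point is only that the app
supplies the fixup policy, and the wrapper applies it before passing data on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a streaming destination (not Jena's StreamRDF). */
interface TripleSink {
    void triple(String subjectIri, String predicateIri, String objectLiteral);
}

/** Transformation step: applies an app-chosen fixup to IRIs, then forwards. */
final class FixupSink implements TripleSink {
    private final TripleSink next;
    private final UnaryOperator<String> iriFixup;

    FixupSink(TripleSink next, UnaryOperator<String> iriFixup) {
        this.next = next;
        this.iriFixup = iriFixup;
    }

    @Override
    public void triple(String subjectIri, String predicateIri, String objectLiteral) {
        // Only IRIs are rewritten; the literal is passed through untouched.
        next.triple(iriFixup.apply(subjectIri), iriFixup.apply(predicateIri), objectLiteral);
    }
}

/** Hypothetical destination that records one line per triple. */
final class CollectingSink implements TripleSink {
    final List<String> lines = new ArrayList<>();

    @Override
    public void triple(String subjectIri, String predicateIri, String objectLiteral) {
        lines.add("<" + subjectIri + "> <" + predicateIri + "> \"" + objectLiteral + "\" .");
    }
}
```

One app might pass {{iri -> iri.replace("\n", "")}} as the fixup; another might
pass a function that throws. Neither choice is built into the wrapper.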
Jena could throw an exception during output, as the RDF/XML writers do,
partially by accident. Checking before any output is too expensive (e.g. when
writing data larger than RAM) because some formats stream. So if the app wants
to check first, it needs to do that check itself before calling write.
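A sketch of what such an app-side check could look like, again JDK-only.
{{IriCheck}} and its method names are hypothetical. The character set is the
one excluded from IRIREF by the N-Triples and Turtle grammars (U+0000 to U+0020
and {{< > " { } | ^ ` \}}); the app would have to call this on each IRI in its
data before handing the data to a writer.

```java
final class IriCheck {
    /** True if the character may appear unescaped inside <...> in N-Triples/Turtle. */
    static boolean isLegalInIriRef(char c) {
        if (c <= 0x20) {
            return false; // control characters and space, including newline
        }
        return "<>\"{}|^`\\".indexOf(c) < 0;
    }

    /** Returns the index of the first offending character, or -1 if there is none. */
    static int firstIllegalIndex(String iri) {
        for (int i = 0; i < iri.length(); i++) {
            if (!isLegalInIriRef(iri.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    /** Throws if the IRI could not be written as-is in the Turtle family of formats. */
    static void requireWritable(String iri) {
        int idx = firstIllegalIndex(iri);
        if (idx >= 0) {
            throw new IllegalArgumentException(String.format(
                "IRI contains U+%04X at index %d", (int) iri.charAt(idx), idx));
        }
    }
}
```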
What writing could do as default behaviour is raise an exception in {{QuotedURI}}.
Doing the _horrible_ UCHAR thing could be done in a {{StreamRDF}} on input or
on output. It's one choice - {{%0D}} is another.
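For comparison, the two choices can be sketched as plain string functions.
{{IriEscapes}} and its methods are hypothetical names, not Jena API. The UCHAR
form keeps the same characters in the RDF term and only changes how they are
spelled in the file; percent-encoding produces a different IRI, so it changes
the data. This sketch does not check whether any particular parser accepts the
result, and it leaves an existing {{%}} alone, so the percent form is not
reliably reversible.

```java
final class IriEscapes {
    /** Characters that may not appear unescaped inside <...> in N-Triples/Turtle. */
    static boolean needsEscape(char c) {
        return c <= 0x20 || "<>\"{}|^`\\".indexOf(c) >= 0;
    }

    /** Spells each offending character as a backslash-u escape with four hex digits. */
    static String uchar(String iri) {
        StringBuilder sb = new StringBuilder();
        for (char c : iri.toCharArray()) {
            if (needsEscape(c)) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Replaces each offending character with a percent sign and two hex digits. */
    static String percentEncode(String iri) {
        StringBuilder sb = new StringBuilder();
        for (char c : iri.toCharArray()) {
            if (needsEscape(c)) {
                // All characters matched by needsEscape are below U+0080: one byte each.
                sb.append(String.format("%%%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```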
FWIW:
{code:java}
boolean x = QueryExecution
.service("https://dbpedia.org/sparql/")
.query("ASK { FILTER ('black' = 'white')}")
.ask();
System.out.println(x);
{code}
(This "no touching the data" kind of issue is not unknown in other systems as
well.)
> Newline (U+000A) in IRIs not escaped during NT/TTL/NQ/TRIG serialization
> -------------------------------------------------------------------------
>
> Key: JENA-2351
> URL: https://issues.apache.org/jira/browse/JENA-2351
> Project: Apache Jena
> Issue Type: Bug
> Components: RIOT
> Affects Versions: Jena 4.7.0
> Reporter: Jan Martin Keil
> Priority: Major
>
> [Newline characters (U+000A) in
> IRIs|https://github.com/dbpedia/extraction-framework/issues/748] are not
> escaped during the serialization of a model or datasets into a format of the
> turtle family. This results in invalid files, which Jena is not able to read
> anymore. Please note the following tests:
> {code:java}
> import org.apache.jena.query.Dataset;
> import org.apache.jena.query.DatasetFactory;
> import org.apache.jena.rdf.model.*;
> import org.apache.jena.riot.Lang;
> import org.apache.jena.riot.RDFDataMgr;
> import org.junit.jupiter.api.Test;
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileOutputStream;
> import java.io.IOException;
> public class Example {
> @Test
> public void rdfXml() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model = ModelFactory.createDefaultModel();
>
> model.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> System.out.println("\nRDF/XML:\n");
> model.write(System.out,"RDF/XML");
> // test write and read
> File file = File.createTempFile("example",".rdf");
> model.write(new FileOutputStream(file),"RDF/XML");
> ModelFactory.createDefaultModel().read(new
> FileInputStream(file),"","RDF/XML");
> }
> @Test
> public void ttl() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model = ModelFactory.createDefaultModel();
>
> model.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> System.out.println("\nTTL:\n");
> model.write(System.out,"TTL");
> // test write and read
> File file = File.createTempFile("example",".ttl");
> model.write(new FileOutputStream(file),"TTL");
> ModelFactory.createDefaultModel().read(new
> FileInputStream(file),"","TTL");
> }
> @Test
> public void nTriples() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model = ModelFactory.createDefaultModel();
>
> model.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> System.out.println("\nN-TRIPLE:\n");
> model.write(System.out,"N-TRIPLE");
> // test write and read
> File file = File.createTempFile("example",".nt");
> model.write(new FileOutputStream(file),"N-TRIPLE");
> ModelFactory.createDefaultModel().read(new
> FileInputStream(file),"","N-TRIPLE");
> }
> @Test
> public void nq() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model1 = ModelFactory.createDefaultModel();
>
> model1.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> Model model2 = ModelFactory.createDefaultModel();
>
> model2.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> Dataset dataset = DatasetFactory.createGeneral();
> dataset.setDefaultModel(model1);
> dataset.addNamedModel("http://example.org/namedGraph",model2);
> System.out.println("\nNQ:\n");
> RDFDataMgr.write(System.out, dataset, Lang.NQ) ;
> // test write and read
> File file = File.createTempFile("example", ".nq");
> RDFDataMgr.write(new FileOutputStream(file), dataset, Lang.NQ) ;
> RDFDataMgr.read(DatasetFactory.createGeneral(), new
> FileInputStream(file), Lang.NQ) ;
> }
> @Test
> public void trig() throws IOException {
> Property someProperty =
> ResourceFactory.createProperty("http://example.org/property");
> Model model1 = ModelFactory.createDefaultModel();
>
> model1.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> Model model2 = ModelFactory.createDefaultModel();
>
> model2.createResource("http://example.org/aaa/\nbbb")
> .addProperty(someProperty, "a string");
> Dataset dataset = DatasetFactory.createGeneral();
> dataset.setDefaultModel(model1);
> dataset.addNamedModel("http://example.org/namedGraph",model2);
> System.out.println("\nTRIG:\n");
> RDFDataMgr.write(System.out, dataset, Lang.TRIG) ;
> // test write and read
> File file = File.createTempFile("example", ".trig");
> RDFDataMgr.write(new FileOutputStream(file), dataset, Lang.TRIG) ;
> RDFDataMgr.read(DatasetFactory.createGeneral(), new
> FileInputStream(file), Lang.TRIG) ;
> }
> }
> {code}
> Outputs (stack traces truncated):
> {code:java}
> N-TRIPLE:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> Apr. 15, 2023 10:01:45 PM
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SEVERE: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> at
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> ...
> {code}
> {code:java}
> RDF/XML:
> <rdf:RDF
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:j.0="http://example.org/">
> <rdf:Description rdf:about="http://example.org/aaa/
> bbb">
> <j.0:property>a string</j.0:property>
> </rdf:Description>
> </rdf:RDF>
> {code}
> {code:java}
> NQ:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" <http://example.org/namedGraph> .
> Apr. 15, 2023 10:01:45 PM
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SEVERE: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> at
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> ...
> {code}
> {code:java}
> TTL:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> Apr. 15, 2023 10:01:45 PM
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SEVERE: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> at
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> ...
> {code}
> {code:java}
> TRIG:
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> <http://example.org/namedGraph> {
> <http://example.org/aaa/
> bbb> <http://example.org/property> "a string" .
> }
> Apr. 15, 2023 10:01:45 PM
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorLogger logError
> SEVERE: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> org.apache.jena.riot.RiotException: [line: 2, col: 1 ] Broken IRI (newline):
> http://example.org/aaa/
> at
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:163)
> ...
> {code}