[
https://issues.apache.org/jira/browse/JENA-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295400#comment-17295400
]
Andy Seaborne commented on JENA-2061:
-------------------------------------
Thanks for the report.
FYI: This works in application/sparql-results+json.
And it would work in XML 1.1 (https://www.w3.org/TR/xml11/#charsets), written
as the character reference {{&#x1B;}}.
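To make the character ranges concrete, here is a minimal Python sketch of a check against the XML 1.0 Char production, plus a helper that rewrites C0 control characters as hexadecimal character references of the kind XML 1.1 accepts. The names find_illegal_xml10 and escape_c0_for_xml11 are hypothetical. This is not how Jena or Fuseki serialize results.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML10_ILLEGAL = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# C0 controls that XML 1.1 accepts, but only as character references.
XML11_C0_RESTRICTED = re.compile(r"[\u0001-\u0008\u000B\u000C\u000E-\u001F]")


def find_illegal_xml10(text):
    """Return (index, code point) for each character XML 1.0 cannot carry."""
    return [(m.start(), ord(m.group())) for m in XML10_ILLEGAL.finditer(text)]


def escape_c0_for_xml11(text):
    """Replace restricted C0 controls with hexadecimal character references.

    Markup escaping of '&' and '<' is a separate step and is not done here.
    """
    return XML11_C0_RESTRICTED.sub(lambda m: "&#x%X;" % ord(m.group()), text)


# The literal from the report: "foo", ESC (0x1B), "bar".
literal = "foo \x1b bar"
problems = find_illegal_xml10(literal)
escaped = escape_c0_for_xml11(literal)
```

A check like this has to see every literal before the response is declared good, which is where the streaming cost comes from.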
There is a cost to checking all results, start to finish, before generating
the HTTP status code: it breaks end-to-end streaming.
The CONSTRUCT case is a 400 because of buffering. The bad character happens to
be reached, and the exception thrown, before the first buffer of characters is
sent, which leaves time for the HTTP status code to be set to 400.
If it were later in the result set, the response would be broken XML.
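The buffering effect can be shown with a toy model. The Python sketch below is an assumption-laden illustration, not Fuseki or servlet code: BufferedResponse, try_fail and stream_results are hypothetical names, and the 64-character buffer is arbitrary. It shows that the status can still be changed while nothing has been flushed, and that it is fixed once the first buffer has gone out.

```python
class BufferedResponse:
    """Toy HTTP response: the status is committed with the first flush."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.status = 200
        self.committed = False  # True once any output has reached the client
        self.pending = ""       # written but not yet sent
        self.sent = ""          # what the client has received so far

    def write(self, text):
        self.pending += text
        if len(self.pending) >= self.buffer_size:
            self.flush()

    def flush(self):
        self.committed = True
        self.sent += self.pending
        self.pending = ""

    def try_fail(self, status):
        """Switch to an error status; only possible before the first flush."""
        if self.committed:
            return False
        self.status = status
        self.pending = ""
        return True


def stream_results(rows, response):
    """Write rows one at a time; return True if the result was sent cleanly."""
    for row in rows:
        if "\x1b" in row:  # stand-in for any character the serializer rejects
            response.try_fail(400)
            return False
        response.write("<literal>%s</literal>" % row)
    response.flush()
    return True


# Bad character in the first row: nothing sent yet, so the status can become 400.
early = BufferedResponse(buffer_size=64)
stream_results(["foo \x1b bar"], early)

# Bad character after several rows: 200 is already committed, the body is cut short.
late = BufferedResponse(buffer_size=64)
stream_results(["ok"] * 10 + ["foo \x1b bar"], late)
```

In the second case the client sees a 200 status followed by an incomplete document, which corresponds to the broken-XML outcome described above.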
> Fuseki XML result serializer outputs characters that are illegal per XML spec
> -----------------------------------------------------------------------------
>
> Key: JENA-2061
> URL: https://issues.apache.org/jira/browse/JENA-2061
> Project: Apache Jena
> Issue Type: Bug
> Components: Fuseki
> Affects Versions: Jena 3.15.0
> Environment: We confirmed the reported behavior in three environments:
> * CentOS 8 with OpenJDK 1.8.0_282
> * macOS 10.15 with OpenJDK 13.0.2
> * macOS 10.14 with Java 8 JDK
> Reporter: Julian Gonggrijp
> Priority: Major
>
> Due to a mistake at our end, our application inserted a literal into the
> triple store that included ASCII character {{0x001B}} (below represented as
> {{ESC}}):
> {code:none}
> PREFIX oa: <http://www.w3.org/ns/oa#>
> PREFIX our: <http://example.org/>
> INSERT DATA {
> our:example oa:exact "foo ESC bar" .
> }
> {code}
> While this was unintentional and I can't really think of a situation where
> inserting control characters is desirable, this is nevertheless allowed by
> the SPARQL and Turtle specifications. I think. Please correct me if I'm
> wrong. Regardless, Fuseki accepts this update request.
> When we subsequently retrieve the data through a {{SELECT}} query with the
> {{ACCEPT}} header set to {{application/sparql-results+xml}}, the XML includes
> this {{ESC}} character again:
> {code:none}
> SELECT ?c WHERE { ?a ?b ?c . }
> {code}
> {code:xml}
> <?xml version="1.0"?>
> <sparql xmlns="http://www.w3.org/2005/sparql-results#">
> <head>
> <variable name="c"/>
> </head>
> <results>
> <result>
> <binding name="c">
> <literal>foo ESC bar</literal>
> </binding>
> </result>
> </results>
> </sparql>
> {code}
> This leads to errors when the result XML is parsed downstream.
> If we do a {{CONSTRUCT}} with {{application/rdf+xml}}, the Fuseki server
> returns a {{400 Bad Request}} instead, which I have double-checked is due to
> the presence of the {{ESC}} character.
> *Edit to add:* the set of valid characters per the XML spec is defined
> [here|https://www.w3.org/TR/REC-xml/#charsets].