afs opened a new issue, #2167:
URL: https://github.com/apache/jena/issues/2167
### Version
5.0.0
### Feature
The Jena 4.x behaviour for writing URIs is to write whatever the string is.
URIs from a parser are correct by RFC3986.
But applications can create URIs from strings and include characters that do
not pass the Turtle (etc) grammar rule of
```
IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
```
The rule on its own is not enough - an IRIREF must also conform to the syntax
of RFC 3986/3987.
Note: a URI containing a space is not a valid URI.
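As an illustration of the character class in that rule, here is a minimal sketch in plain Java that finds the first character of a URI string that the `IRIREF` production would not accept unescaped. The class and method names (`IriRefChars`, `firstBadIndex`) are hypothetical and are not part of Jena; the check covers only the grammar's character class and does not validate RFC 3986/3987 syntax.

```java
// Hypothetical helper, not a Jena API: checks only the IRIREF character class
// [^#x00-#x20<>"{}|^`\], not the RFC 3986/3987 syntax.
final class IriRefChars {
    // Characters explicitly excluded by the grammar rule, besides #x00-#x20.
    private static final String EXCLUDED = "<>\"{}|^`\\";

    /** True if the code point may appear unescaped inside an IRIREF. */
    static boolean isAllowedRaw(int codePoint) {
        return codePoint > 0x20 && EXCLUDED.indexOf(codePoint) < 0;
    }

    /** Index of the first character the rule rejects, or -1 if there is none. */
    static int firstBadIndex(String iri) {
        for (int i = 0; i < iri.length(); ) {
            int cp = iri.codePointAt(i);
            if (!isAllowedRaw(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

A string such as `http://example/a b` is rejected at the space, while `http://example/ab` passes this check (it could still fail RFC 3986 validation for other reasons).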
Jena5 is an opportunity to change the behaviour of Jena.
What should Jena do?
Choices:
- Same as Jena4
- Write a UCHAR (`\uNNNN`) - legal by that `IRIREF`, illegal as a URI (`\`
is not legal in a URI).
- Write percent-encoded output (`%NN`) - this changes the URI to a different
URI (the bad character is replaced by three characters: `%`, `X`, `X`, where
each `X` is a hex digit).
- Some mixture
- Throw an exception and stop the output.
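
To make the choices concrete, here is a rough sketch in plain Java of how each policy would render a URI string that contains a character outside the `IRIREF` character class. Everything here (`IriWritePolicy`, `Mode`, `format`) is a hypothetical name for illustration only; it is not how Jena writers are implemented, and the "mixture" option is not modelled.

```java
// Hypothetical sketch of the output choices; not Jena code.
final class IriWritePolicy {
    enum Mode { AS_IS, UCHAR, PERCENT, ERROR }

    // Characters excluded by the IRIREF rule, besides #x00-#x20.
    private static final String EXCLUDED = "<>\"{}|^`\\";

    static boolean isBad(int cp) {
        return cp <= 0x20 || EXCLUDED.indexOf(cp) >= 0;
    }

    /** Render the URI string as an IRIREF token under the given policy. */
    static String format(String iri, Mode mode) {
        StringBuilder out = new StringBuilder("<");
        for (int i = 0; i < iri.length(); ) {
            int cp = iri.codePointAt(i);
            i += Character.charCount(cp);
            if (mode == Mode.AS_IS || !isBad(cp)) {
                out.appendCodePoint(cp);   // Jena 4 behaviour: write the string unchanged
                continue;
            }
            // Every "bad" character is ASCII, so four hex digits (UCHAR form)
            // or a single percent-encoded byte is always enough here.
            switch (mode) {
                case UCHAR:
                    out.append(String.format("\\u%04X", cp));
                    break;
                case PERCENT:
                    out.append(String.format("%%%02X", cp)); // yields a different URI
                    break;
                default:
                    throw new IllegalArgumentException(
                        "Illegal character in URI, code point " + cp);
            }
        }
        return out.append('>').toString();
    }
}
```

For the input `http://example/a b`, `AS_IS` writes the space through unchanged, `UCHAR` writes `<http://example/a\u0020b>`, `PERCENT` writes `<http://example/a%20b>`, and `ERROR` throws.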
### Are you interested in contributing a solution yourself?
Perhaps?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]