afs opened a new issue, #2167:
URL: https://github.com/apache/jena/issues/2167

   ### Version
   
   5.0.0
   
   ### Feature
   
   Jena 4.x writes a URI as whatever string it holds, without changing it.
   URIs that come from a parser are valid according to RFC 3986.
   
   But applications can create URIs from strings and include characters that do not pass the Turtle (etc.) grammar rule:
   
   ```
   IRIREF    ::=     '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
   ```
   The rule on its own is not enough: an IRIREF must also conform to the syntax of RFC 3986/3987.
   
   Note: URIs containing spaces are invalid URIs.
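
To make the character class in the rule concrete, here is a minimal sketch of a check on the raw URI string. The class and method names (`IriRefChars`, `isAllowed`, `firstExcluded`) are hypothetical and are not part of Jena. The check covers only the character class of `IRIREF`; it does not validate RFC 3986/3987 syntax and does not interpret UCHAR escapes.

```java
final class IriRefChars {
    // Characters excluded by the IRIREF character class, in addition to
    // the range U+0000 to U+0020: < > " { } | ^ ` and backslash.
    private static final String EXCLUDED = "<>\"{}|^`\\";

    /** True if the character may appear unescaped between '<' and '>'. */
    static boolean isAllowed(char ch) {
        return ch > 0x20 && EXCLUDED.indexOf(ch) < 0;
    }

    /** Index of the first excluded character, or -1 if there is none. */
    static int firstExcluded(String iri) {
        for (int i = 0; i < iri.length(); i++) {
            if (!isAllowed(iri.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A URI string with a space, as an application might create it.
        System.out.println(firstExcluded("http://example/a b"));
        // A URI string with no excluded characters.
        System.out.println(firstExcluded("http://example/ab"));
    }
}
```

A string that passes this check can still be an invalid URI, which is the point made above about the rule not being enough on its own.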
   
   Jena5 is an opportunity to change the behaviour of Jena.
   
   What should Jena do?
   
   Choices:
   
   - Same as Jena4
   - Write a UCHAR (`\uNNNN`): legal by that `IRIREF` rule, but illegal as a URI (`\` is not legal in a URI).
   - Write percent-encoded output (`%NN`): this changes the URI to a different URI (the bad character is replaced by three characters: `%` followed by two hex digits).
   - Some mixture
   - Throw an exception and stop the output.
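
For comparison, the UCHAR, percent-encoding, and exception choices could be sketched roughly as follows. This is an illustration only, not Jena code: the class `IriWriteChoices` and its methods are hypothetical names, and "bad character" here means only a character excluded by the `IRIREF` character class (all of which are ASCII, so one escape per character is enough).

```java
final class IriWriteChoices {
    // Characters excluded by IRIREF besides U+0000 to U+0020.
    private static final String EXCLUDED = "<>\"{}|^`\\";

    private static boolean isBad(char ch) {
        return ch <= 0x20 || EXCLUDED.indexOf(ch) >= 0;
    }

    /** Choice: replace each bad character with a UCHAR escape (backslash, 'u', four hex digits). */
    static String writeUchar(String iri) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < iri.length(); i++) {
            char ch = iri.charAt(i);
            if (isBad(ch)) {
                sb.append(String.format("\\u%04X", (int) ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    /** Choice: percent-encode each bad character. The result is a different URI. */
    static String writePercent(String iri) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < iri.length(); i++) {
            char ch = iri.charAt(i);
            if (isBad(ch)) {
                sb.append(String.format("%%%02X", (int) ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    /** Choice: refuse to write, by throwing on the first bad character. */
    static String writeStrict(String iri) {
        for (int i = 0; i < iri.length(); i++) {
            if (isBad(iri.charAt(i))) {
                throw new IllegalArgumentException(
                        "Bad character in URI at index " + i);
            }
        }
        return iri;
    }

    public static void main(String[] args) {
        String bad = "http://example/a b";
        System.out.println(writeUchar(bad));
        System.out.println(writePercent(bad));
    }
}
```

The sketch makes the trade-off visible: the UCHAR form round-trips through a Turtle parser to the same (still invalid) string, while the percent-encoded form is a valid but different URI.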
   
   ### Are you interested in contributing a solution yourself?
   
   Perhaps?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

