afs opened a new pull request, #2769:
URL: https://github.com/apache/jena/pull/2769

   GitHub issue resolved #2766
   
   Pull request Description:
   
   Because Java bytes-to-string conversion uses the JDK decoder, Jena can't 
tell the difference between multibyte characters translated to surrogates 
(legal) and surrogates actually present in the UTF-8 input (illegal - UTF-8 
does not allow surrogates).
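
   As an illustration only (this is not Jena code; the class and method names 
are hypothetical), the sketch below shows why inspecting the decoded string is 
not enough. A legal four-byte UTF-8 sequence becomes a surrogate pair in a Java 
`String`, so the mere presence of surrogate chars does not indicate bad input. 
A strict `CharsetDecoder` working on the raw bytes, by contrast, rejects a 
surrogate that was encoded directly as three bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class SurrogateDecodeSketch {

    // True if the bytes are well-formed UTF-8 according to the JDK decoder
    // when it is configured to report errors instead of replacing them.
    static boolean isStrictlyValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // True if any UTF-16 code unit in the string is a surrogate.
    static boolean containsSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // U+1F600 as a legal four-byte UTF-8 sequence.
        byte[] legal = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x80};
        // U+D800 (a high surrogate) encoded directly as three bytes: not valid UTF-8.
        byte[] illegal = {(byte) 0xED, (byte) 0xA0, (byte) 0x80};

        String fromLegal = new String(legal, StandardCharsets.UTF_8);
        String fromIllegal = new String(illegal, StandardCharsets.UTF_8);

        System.out.println("legal bytes, strict decode ok:   " + isStrictlyValidUtf8(legal));
        System.out.println("legal bytes, surrogates in text: " + containsSurrogate(fromLegal));
        System.out.println("illegal bytes, strict decode ok: " + isStrictlyValidUtf8(illegal));
        System.out.println("illegal bytes, surrogates in text: " + containsSurrogate(fromIllegal));
    }
}
```

   The strict check operates on bytes before conversion; whether such a check 
fits Jena's parsing pipeline, and at what cost, is not something this sketch 
addresses.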
   
   The test changes are bug fixes. The tests were detecting warnings on the 
replacement character, but that character is explicitly handled and allowed, 
controlled by a flag, further up.
   
   A deep fix might be possible - but it involves our own UTF-8 decoder and 
will need careful assessment of the performance impact. 
   
   ----
   
    - [x] Commits have been squashed to remove intermediate development commit 
messages.
    - [x] Key commit messages start with the issue number (GH-xxxx)
   
   By submitting this pull request, I acknowledge that I am making a 
contribution to the Apache Software Foundation under the terms and conditions 
of the [Contributor's 
Agreement](https://www.apache.org/licenses/contributor-agreements.html).
   
   ----
   
   See the [Apache Jena "Contributing" 
guide](https://github.com/apache/jena/blob/main/CONTRIBUTING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@jena.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


