[ https://issues.apache.org/jira/browse/JENA-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433650#comment-17433650 ]
ASF subversion and git services commented on JENA-2186:
-------------------------------------------------------
Commit e9d1b15be7ace9947ea5858d5cf61ce47113c562 in jena's branch
refs/heads/main from Andy Seaborne
[ https://gitbox.apache.org/repos/asf?p=jena.git;h=e9d1b15 ]
Merge pull request #1090 from afs/char-FFFD
JENA-2186: Write FFFD as Unicode escape
> Write U+FFFD as Unicode escape
> ------------------------------
>
> Key: JENA-2186
> URL: https://issues.apache.org/jira/browse/JENA-2186
> Project: Apache Jena
> Issue Type: Improvement
> Affects Versions: Jena 4.2.0
> Reporter: Andy Seaborne
> Priority: Major
> Fix For: Jena 4.3.0
>
>
> U+FFFD (Unicode replacement character) arises when there is an encoding
> mismatch between the input bytes and UTF-8 (see the [wikipedia
> article|https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character]).
> The tokenizer for Turtle, N-Triples, etc. raises a warning when a literal U+FFFD
> is encountered, to notify users and applications of potential problems.
> The tokenizer does not warn if the character is written intentionally in the
> input stream as the escape {{\uFFFD}} (6 characters).
> The writer should use this Unicode escape form so that the character U+FFFD is
> written and read back in without a warning.
>
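For illustration only, the writer-side behaviour described above could be sketched as follows. This is not the code from the commit: the class and method names (ReplacementCharEscape, escapeReplacementChar) are hypothetical, and the sketch assumes the only character being handled is U+FFFD. A real Turtle/N-Triples writer also has to escape quotes, backslashes and newlines, which is left out here.

```java
// Hypothetical helper, not part of the Jena API: a minimal sketch of writing
// the replacement character as a six-character Unicode escape.
public final class ReplacementCharEscape {
    // U+FFFD, the Unicode replacement character.
    private static final char REPLACEMENT = (char) 0xFFFD;

    private ReplacementCharEscape() {}

    /** Returns the input with each U+FFFD replaced by its backslash-u escape. */
    public static String escapeReplacementChar(String lexical) {
        StringBuilder out = new StringBuilder(lexical.length());
        for (int i = 0; i < lexical.length(); i++) {
            char ch = lexical.charAt(i);
            if (ch == REPLACEMENT) {
                // Emit the six characters: backslash, 'u', 'F', 'F', 'F', 'D'.
                out.append("\\uFFFD");
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal U+FFFD in the middle is written out in escaped form.
        String escaped = escapeReplacementChar("abc" + REPLACEMENT + "def");
        System.out.println(escaped);
    }
}
```

Because the output contains the escape rather than the raw character, reading it back should not trigger the tokenizer warning described in the issue.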
--
This message was sent by Atlassian Jira
(v8.3.4#803005)