[
https://issues.apache.org/jira/browse/JENA-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119517#comment-13119517
]
Andy Seaborne commented on JENA-129:
------------------------------------
The issue with i18n/normalization-01.ttl is that the tokenization for prefix
names does not include combining characters (Unicode: [#0300-#036F]).
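As a rough illustration of the gap described above, the following Java sketch
tests whether a code point may appear in the body of a prefixed name. The class
and method names are hypothetical and this is not RIOT's tokenizer; the
letter/digit check is a deliberate simplification of the grammar's base
character ranges. The extra ranges (#00B7, #0300-#036F, #203F-#2040) are the
ones the SPARQL and Turtle grammars list for PN_CHARS.

```java
class PrefixNameChars {
    // Hypothetical helper: true if cp lies in the combining diacritical
    // marks range [#0300-#036F] mentioned in the comment.
    static boolean isCombiningMark(int cp) {
        return cp >= 0x0300 && cp <= 0x036F;
    }

    // Simplified stand-in for a PN_CHARS test (assumption: letters and
    // digits approximate the grammar's base character ranges).
    static boolean isPrefixNameChar(int cp) {
        return Character.isLetterOrDigit(cp)
            || cp == '_' || cp == '-'
            || cp == 0x00B7
            || isCombiningMark(cp)
            || (cp >= 0x203F && cp <= 0x2040);
    }

    public static void main(String[] args) {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT
        String name = "e\u0301";
        boolean ok = name.codePoints().allMatch(PrefixNameChars::isPrefixNameChar);
        System.out.println(ok);
    }
}
```

A tokenizer that omits the isCombiningMark case would stop the prefixed name
at the accent, which is the kind of failure reported for this test file.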
Priority changed to minor. Combining characters are unusual in URIs (and lead
to trouble anyway, because URIs with identical visual appearance can be
unequal, e.g. é and é are different: the second is two characters, the second
of which is a combining diacritic accent).
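The difference between the two forms can be checked directly in Java. This is
a minimal sketch with a hypothetical class name; it uses only
java.text.Normalizer from the standard library and says nothing about how Jena
compares URIs.

```java
import java.text.Normalizer;

class CombiningDemo {
    // U+00E9: precomposed e with acute accent, one code point
    static final String PRECOMPOSED = "\u00E9";
    // U+0065 U+0301: plain e followed by a combining acute accent
    static final String DECOMPOSED = "e\u0301";

    // Compare two strings after normalizing both to NFC.
    static boolean equalAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        // Plain string comparison sees two different sequences.
        System.out.println(PRECOMPOSED.equals(DECOMPOSED));
        // After NFC normalization the two forms compare equal.
        System.out.println(equalAfterNfc(PRECOMPOSED, DECOMPOSED));
    }
}
```

URI comparison is character by character, so without a normalization step the
two spellings name different resources.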
The background email message has a confusing title - this is not a UTF-8 issue.
Also, note that the email is best viewed as plain text to show the difference
between é and e followed by ́.
RIOT uses the Java platform decoder for UTF-8.
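To see why this is not a UTF-8 issue, here is a small sketch (hypothetical
class name, standard library only) that decodes both byte sequences with the
Java platform decoder. Both decode cleanly; they simply yield different
character sequences.

```java
import java.nio.charset.StandardCharsets;

class Utf8DecodeDemo {
    // Decode raw bytes with the platform's UTF-8 decoder.
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // C3 A9 is the UTF-8 encoding of U+00E9 (precomposed).
        byte[] precomposed = {(byte) 0xC3, (byte) 0xA9};
        // 65 CC 81 is 'e' followed by the UTF-8 encoding of U+0301.
        byte[] decomposed = {0x65, (byte) 0xCC, (byte) 0x81};

        System.out.println(decode(precomposed).length());
        System.out.println(decode(decomposed).length());
    }
}
```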
The validation warning parsing "DAWG-Final/i18n/normalization-02.ttl" is
unrelated. The message is correct (look at the input URI). It's outputting the
post-resolution IRI.
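The effect can be sketched with java.net.URI, which is not the IRI library
Jena uses and follows the older RFC 2396 rules, so treat this only as an
illustration. The input string below is an assumption: it is the dot-segment
example from RFC 3986 section 6.2.2, chosen because it resembles the IRI in
the warning; the actual content of the test file is not shown in this message.
The class and method names are hypothetical.

```java
import java.net.URI;

class DotSegmentDemo {
    // Remove "." and ".." path segments, leaving the rest of the URI as-is.
    static URI removeDotSegments(String input) {
        return URI.create(input).normalize();
    }

    public static void main(String[] args) {
        // Assumed input containing /./ and /../ segments.
        String input = "eXAMPLE://a/./b/../b/%63/%7bfoo%7d#xyz";
        URI resolved = removeDotSegments(input);
        // The check (and the warning) applies to the input form, while the
        // text printed in the message is the resolved form.
        System.out.println("input:    " + input);
        System.out.println("resolved: " + resolved);
    }
}
```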
> RIOT not parsing UTF combining characters correctly
> ---------------------------------------------------
>
> Key: JENA-129
> URL: https://issues.apache.org/jira/browse/JENA-129
> Project: Jena
> Issue Type: Bug
> Components: RIOT
> Environment: Java 1.6, Windows 7, ARQ 2.8.8
> Reporter: Tim Harsch
> Labels: RIOT
>
> Background on the issue can be found at the list archive:
> http://mail-archives.apache.org/mod_mbox/incubator-jena-users/201110.mbox/%[email protected]%3E
> RIOT failed to parse the SPARQL 1.0 DAWG test: "i18n/normalization-01.ttl".
> In offline email Andy also noted:
> I see one oddity:
> [[
> ==== DAWG-Final/i18n/normalization-02.ttl
> WARN [line: 7, col: 8 ] Bad IRI: <eXAMPLE://a/b/%63/%7bfoo%7d#xyz>
> Code: 8/NON_INITIAL_DOT_SEGMENT in PATH: The path contains a segment
> /../ not at the beginning of a relative reference, or it contains a /./
> These should be removed.
> ]]
> because the test is on the input, not the resultant form used for toString.