[jira] [Commented] (TRAFODION-2477) Invalid characters in UCS2 to UTF8 translation are not handled correctly

Hans Zeller (JIRA) Wed, 01 Mar 2017 14:45:13 -0800

    [ https://issues.apache.org/jira/browse/TRAFODION-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891232#comment-15891232 ]


Hans Zeller commented on TRAFODION-2477:
----------------------------------------

Right now we don't use the correct replacement character for Unicode.

> Invalid characters in UCS2 to UTF8 translation are not handled correctly
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-2477
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2477
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: 2.0-incubating
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>             Fix For: 2.2-incubating
>
>
> When translating from UCS-2 to UTF-8, using CAST or TRANSLATE(... 
> UCS2TOUTF8), all valid characters will map easily to a UTF-8 character. 
> However, if we encounter invalid code points or invalid UTF-16 surrogate 
> pairs, those could raise errors. Right now we just suppress those errors. 
> Instead we should either translate them to the Unicode "replacement 
> character" U+FFFD or we should raise an error. Ideally, we should have a CQD 
> that decides which of these two actions to take.
> Test case:
> create table tbaducs2(a char(10) character set ucs2);
> -- DC00 is a UTF-16 low surrogate; on its own it is invalid
> insert into tbaducs2 values(_ucs2 X'DC000041');
> select translate(a using ucs2toutf8) from tbaducs2;
> -- this returns an empty string - no error, no replacement character
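
The two behaviors proposed in the description (substitute U+FFFD, or raise an error) can be illustrated with a small standalone sketch. This is not Trafodion code: the function name `ucs2_to_utf8` and its `on_error` parameter are hypothetical, with `on_error` standing in for the CQD the description suggests, and the input is assumed to be big-endian 16-bit code units so that X'DC000041' reads as DC00 followed by 0041.

```python
REPLACEMENT = "\ufffd"  # Unicode replacement character U+FFFD


def ucs2_to_utf8(data: bytes, on_error: str = "replace") -> bytes:
    """Translate big-endian 16-bit code units to UTF-8 (illustrative only).

    on_error is a hypothetical switch, not an existing Trafodion setting:
    "replace" emits U+FFFD for an unpaired surrogate, "error" raises.
    """
    if len(data) % 2:
        raise ValueError("input must contain whole 16-bit code units")

    # Assumption: code units are stored big-endian, as the hex literal reads.
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low_next = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low_next:
            # Valid surrogate pair: combine into one supplementary code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            out.append(chr(cp))
            i += 2
            continue
        if 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate: either fail loudly or substitute U+FFFD.
            if on_error == "error":
                raise ValueError(f"unpaired surrogate U+{u:04X} at code unit {i}")
            out.append(REPLACEMENT)
        else:
            out.append(chr(u))
        i += 1
    return "".join(out).encode("utf-8")
```

With the test-case bytes, `ucs2_to_utf8(bytes.fromhex("DC000041"))` yields the UTF-8 encoding of U+FFFD followed by "A" instead of an empty string, while `on_error="error"` raises a `ValueError` naming the offending code unit.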



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
