[ https://issues.apache.org/jira/browse/CALCITE-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457488#comment-16457488 ]
yuqi commented on CALCITE-2273:
-------------------------------

As far as I can see, a Unicode escape value longer than 4 hex digits will be converted to two ordinary 4-digit Unicode escapes (a surrogate pair). For example, U&'\01CCCC' will be converted to '\uD833\uDCCC'.

> <Unicode 6 digit escape value> misinterpreted
> ----------------------------------------------
>
>                 Key: CALCITE-2273
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2273
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Zhong Yu
>            Assignee: Julian Hyde
>            Priority: Major
>
> The following string literal is accepted by the Calcite parser, but the
> resulting value in Java is incorrect:
> {code:java}
> U&'\+01F600'{code}
>
> Cause: currently, SqlLiteral.unescapeUnicode() is only intended to handle
> 4-digit Unicode escape values of the form "\xyzw". When given the 6-digit
> form "\+xyzwrs", it parses the four characters "+xyz" as a hexadecimal
> number, which also succeeds. The result therefore contains 4 characters:
> [\u0xyz, w, r, s]
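
To make the two behaviours above concrete, here is a minimal standalone Java sketch. It is not Calcite's code: the class UnicodeEscapeSketch and the methods decode and naiveDecode are hypothetical names made up for illustration, and naiveDecode only imitates the misreading described in the issue rather than reproducing SqlLiteral.unescapeUnicode() itself.

```java
final class UnicodeEscapeSketch {

    /**
     * Decodes the hex digits of an escape value (4 or 6 digits, without the
     * leading backslash or plus sign) into a Java string.
     */
    public static String decode(String hexDigits) {
        int codePoint = Integer.parseInt(hexDigits, 16);
        // One char for a BMP code point, a surrogate pair for anything above 0xFFFF.
        return new String(Character.toChars(codePoint));
    }

    /**
     * Imitates the misreading described in the issue: the first four chars
     * after the backslash are parsed as hex, the rest are kept as literal text.
     */
    public static String naiveDecode(String afterBackslash) {
        // parseInt accepts a leading plus sign, so "+01F" parses without error.
        char first = (char) Integer.parseInt(afterBackslash.substring(0, 4), 16);
        return first + afterBackslash.substring(4);
    }

    /** Prints each UTF-16 code unit of s as four hex digits. */
    private static void printUnits(String label, String s) {
        StringBuilder sb = new StringBuilder(label).append(" ->");
        for (char c : s.toCharArray()) {
            sb.append(String.format(" %04X", (int) c));
        }
        System.out.println(sb);
    }

    public static void main(String[] args) {
        printUnits("decode(01F600)", decode("01F600"));
        printUnits("decode(01CCCC)", decode("01CCCC"));
        printUnits("naiveDecode(+01F600)", naiveDecode("+01F600"));
    }
}
```

Running main should print the code units D83D DE00 for 01F600, D833 DCCC for 01CCCC, and 001F 0036 0030 0030 for the naive reading of +01F600, which matches the four-character result [\u0xyz, w, r, s] reported in the issue.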