Ye Ding created CALCITE-2163:
--------------------------------

             Summary: Using "UTF16" as default charset failed
                 Key: CALCITE-2163
                 URL: https://issues.apache.org/jira/browse/CALCITE-2163
             Project: Calcite
          Issue Type: Bug
            Reporter: Ye Ding
            Assignee: Julian Hyde

I have a project that needs to handle non-ASCII characters, so I set the default charset to "UTF16" through the "saffron.default.charset" property. It failed with the error stack below:

{code:txt}
Caused by: java.nio.charset.UnsupportedCharsetException: UTF-16
	at org.apache.calcite.util.NlsString.<init>(NlsString.java:72)
	at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:882)
	at org.apache.calcite.rex.RexBuilder.<init>(RexBuilder.java:117)
	at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1046)
	at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
	... 29 more
{code}

Having explored the related source code, I found a suspicious piece of code that may cause the problem. Here is a code block from RexBuilder, between L869 and L883.

{code:java}
    case CHAR:
      // Character literals must have a charset and collation. Populate
      // from the type if necessary.
      assert o instanceof NlsString;
      NlsString nlsString = (NlsString) o;
      if ((nlsString.getCollation() == null)
          || (nlsString.getCharset() == null)) {
        assert type.getSqlTypeName() == SqlTypeName.CHAR;
        assert type.getCharset().name() != null;
        assert type.getCollation() != null;
        o = new NlsString(
            nlsString.getValue(),
            type.getCharset().name(),
            type.getCollation());
      }
{code}

In the final statement, type.getCharset().name() passes a *Java* charset name to the NlsString constructor. Judging from the constructor's code, however, charsetName is supposed to be a *SQL* charset name: it is translated with SqlUtil.translateCharacterSetName, and a null result raises the UnsupportedCharsetException seen above.

{code:java}
  public NlsString(
      String value,
      String charsetName,
      SqlCollation collation) {
    assert value != null;
    if (null != charsetName) {
      charsetName = charsetName.toUpperCase(Locale.ROOT);
      this.charsetName = charsetName;
      String javaCharsetName =
          SqlUtil.translateCharacterSetName(charsetName);
      if (javaCharsetName == null) {
        throw new UnsupportedCharsetException(charsetName);
      }
      this.charset = Charset.forName(javaCharsetName);
      CharsetEncoder encoder = charset.newEncoder();
      ....
{code}

I have not read and fully understood the code, so I'm not sure whether this is the root cause of the problem. For now I have worked around it by setting "saffron.default.charset" to "UTF-16LE".
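
To illustrate the suspected mismatch outside Calcite, here is a minimal standalone sketch. It uses only java.nio.charset.Charset from the JDK. The class CharsetNameDemo and its translate method are hypothetical stand-ins: the mapping inside translate is assumed for illustration and is not copied from SqlUtil.translateCharacterSetName. The point is only that the name configured by the user and the canonical name reported by Charset.name() can differ, so a lookup keyed on the first may reject the second.

```java
import java.nio.charset.Charset;
import java.util.Locale;

/** Hypothetical demo class; not part of Calcite. */
final class CharsetNameDemo {

  /**
   * Hypothetical stand-in for a SQL-name to Java-name lookup.
   * The entries are assumptions for illustration only.
   * Returns null when the name is not recognised.
   */
  static String translate(String sqlName) {
    switch (sqlName.toUpperCase(Locale.ROOT)) {
    case "UTF16":
      return "UTF-16";
    case "UTF-16LE":
      return "UTF-16LE";
    default:
      return null;
    }
  }

  /** Returns the canonical Java name that Charset reports for a name or alias. */
  static String canonicalName(String name) {
    return Charset.forName(name).name();
  }

  public static void main(String[] args) {
    for (String configured : new String[] {"UTF16", "UTF-16LE"}) {
      // The JDK resolves the alias and reports its own canonical name.
      String canonical = canonicalName(configured);
      System.out.println("configured=" + configured
          + " canonical=" + canonical
          + " translate(configured)=" + translate(configured)
          + " translate(canonical)=" + translate(canonical));
    }
  }
}
```

With this stand-in table, "UTF16" translates on its own, but its canonical Java name "UTF-16" does not, which corresponds to the name in the exception message above. "UTF-16LE" is spelled identically in both forms, which would be consistent with the workaround succeeding.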