Ye Ding created CALCITE-2163:
--------------------------------
Summary: Using "UTF16" as default charset failed
Key: CALCITE-2163
URL: https://issues.apache.org/jira/browse/CALCITE-2163
Project: Calcite
Issue Type: Bug
Reporter: Ye Ding
Assignee: Julian Hyde
I have a project that needs to handle non-ASCII characters, so I set the default
charset to "UTF16" by setting "saffron.default.charset" to "UTF16", but it failed
with the error stack below:
{code:txt}
Caused by: java.nio.charset.UnsupportedCharsetException: UTF-16
at org.apache.calcite.util.NlsString.<init>(NlsString.java:72)
at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:882)
at org.apache.calcite.rex.RexBuilder.<init>(RexBuilder.java:117)
at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1046)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
... 29 more
{code}
Having explored the related source code, I found some suspicious code that may
cause the problem.
Here is a code block from RexBuilder, lines 869 to 883.
{code:java}
case CHAR:
  // Character literals must have a charset and collation. Populate
  // from the type if necessary.
  assert o instanceof NlsString;
  NlsString nlsString = (NlsString) o;
  if ((nlsString.getCollation() == null)
      || (nlsString.getCharset() == null)) {
    assert type.getSqlTypeName() == SqlTypeName.CHAR;
    assert type.getCharset().name() != null;
    assert type.getCollation() != null;
    o = new NlsString(
        nlsString.getValue(),
        type.getCharset().name(),
        type.getCollation());
  }
{code}
In the last statement, a *Java* charset name (type.getCharset().name()) is used
to construct the NlsString. But judging from the NlsString constructor, the
charsetName argument is supposed to be a *SQL* charset name.
{code:java}
public NlsString(
    String value,
    String charsetName,
    SqlCollation collation) {
  assert value != null;
  if (null != charsetName) {
    charsetName = charsetName.toUpperCase(Locale.ROOT);
    this.charsetName = charsetName;
    String javaCharsetName =
        SqlUtil.translateCharacterSetName(charsetName);
    if (javaCharsetName == null) {
      throw new UnsupportedCharsetException(charsetName);
    }
    this.charset = Charset.forName(javaCharsetName);
    CharsetEncoder encoder = charset.newEncoder();
    ....
{code}
I have not fully read and understood the code, so I'm not sure whether this is
the root cause of the problem. For now I have worked around it by setting
"saffron.default.charset" to "UTF-16LE".
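
The suspected mismatch can be illustrated with a small standalone sketch that
does not use Calcite at all. The class CharsetNameRoundTrip, the SQL_TO_JAVA
map, and the translate and roundTrip methods are hypothetical stand-ins for the
SQL-to-Java name translation. The two map entries are assumptions inferred from
the behaviour reported here, not a copy of
SqlUtil.translateCharacterSetName. Only java.nio.charset.Charset is a real API.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class CharsetNameRoundTrip {
  // Hypothetical SQL-name -> Java-name table (assumption, see lead-in).
  // "UTF-16" is deliberately absent as a key, matching the reported failure.
  static final Map<String, String> SQL_TO_JAVA = new HashMap<>();
  static {
    SQL_TO_JAVA.put("UTF16", "UTF-16");
    SQL_TO_JAVA.put("UTF-16LE", "UTF-16LE");
  }

  // Returns the Java charset name for a SQL charset name, or null if unknown.
  static String translate(String sqlName) {
    return SQL_TO_JAVA.get(sqlName.toUpperCase(Locale.ROOT));
  }

  // Translates a SQL name to a Java charset, then feeds the charset's
  // canonical Java name back into the SQL-name translation, which is what
  // passing type.getCharset().name() to the constructor appears to do.
  static String roundTrip(String sqlName) {
    String javaName = translate(sqlName);
    if (javaName == null) {
      return null;
    }
    String canonical = Charset.forName(javaName).name();
    return translate(canonical);
  }

  public static void main(String[] args) {
    System.out.println(Charset.forName("UTF-16").name()); // UTF-16
    System.out.println(roundTrip("UTF16"));               // null
    System.out.println(roundTrip("UTF-16LE"));            // UTF-16LE
  }
}
```

Under these assumptions, "UTF16" does not survive the round trip because its
canonical Java name "UTF-16" is not itself a recognised SQL name, while
"UTF-16LE" is spelled the same in both namespaces, which would explain why the
workaround succeeds.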