[
https://issues.apache.org/jira/browse/CALCITE-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aryeh Hillman updated CALCITE-3933:
-----------------------------------
Description:
A string literal like "schön" should be emitted as 'schön' in SQL for most
dialects (BigQuery, MySQL, Redshift and a few others), but instead it is
emitted as
{code:java}
u&'sch\\00f6n' {code}
which is the Unicode-escaped form of the same string (00f6 is the code point
of "ö", the same value it has in ISO-8859-1).
It's possible that some of the above dialects accept this escaped form, but in
my tests with *BigQuery Standard SQL*, *MySQL*, and *Redshift* engines, the
following fails:
{code:java}
select u&'sch\\00f6n';{code}
But this succeeds:
{code:java}
select 'schön'; {code}
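To make the difference between the two forms concrete, here is a minimal, self-contained sketch of how a Java string could be rendered either as a plain quoted literal or as a `u&'...'` escaped literal. This is not Calcite's implementation; the class `UnicodeLiteralSketch` and the methods `plainLiteral` and `escapedLiteral` are hypothetical names used only for illustration, and the escaping rules are simplified (BMP characters only, four hex digits).

```java
public class UnicodeLiteralSketch {
  /** Quotes a string as a plain SQL literal, doubling embedded single quotes. */
  public static String plainLiteral(String s) {
    return "'" + s.replace("'", "''") + "'";
  }

  /**
   * Renders a string as a Unicode-escaped literal of the form u&'...'.
   * Printable ASCII is copied through; every other char becomes a backslash
   * followed by four hex digits. Simplified: no surrogate-pair handling.
   */
  public static String escapedLiteral(String s) {
    StringBuilder sb = new StringBuilder("u&'");
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '\'') {
        sb.append("''");               // double the quote character
      } else if (c == '\\') {
        sb.append("\\\\");             // double the escape character
      } else if (c < 0x20 || c > 0x7e) {
        sb.append('\\').append(String.format("%04x", (int) c));
      } else {
        sb.append(c);
      }
    }
    return sb.append('\'').toString();
  }

  public static void main(String[] args) {
    String value = "sch\u00f6n";       // the example string from this report
    System.out.println(escapedLiteral(value)); // the form reported as failing
    System.out.println(plainLiteral(value));   // the form the report expects
  }
}
```

A dialect that does not accept the escaped form would, under this sketch, simply use `plainLiteral` for every string and rely on the connection's character set to carry the non-ASCII characters.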
Test that demonstrates the problem (add to
`org/apache/calcite/rel/rel2sql/RelToSqlConverterTest.java` and run from there):
{code:java}
@Test void testBigQueryUnicode() {
  final Function<RelBuilder, RelNode> relFn = b ->
      b.scan("EMP")
          .filter(
              b.call(SqlStdOperatorTable.IN, b.field("ENAME"),
                  b.literal("schön")))
          .build();
  final String expectedSql = "SELECT *\n"
      + "FROM \"scott\".\"EMP\"\n"
      + "WHERE \"ENAME\" IN ('schön')";
  relFn(relFn).withBigQuery().ok(expectedSql);
}
{code}
was:
A string literal like "schön" should emit "schön" in SQL for most dialects
(BigQuery, MySQL, Redshift and a few others), but instead emits
"u&'sch\\00f6n'" (ISO-8859-1 ASCII). It's possible that some dialects may
support ISO-8859, but in my tests with BigQuery Standard SQL, MySQL, and
Redshift engines, `select u&'sch\\00f6n';` fails but `select 'schön';` succeeds.
Test that demonstrates (add to
`org/apache/calcite/rel/rel2sql/RelToSqlConverterTest.java` and run from there):
{code:java}
@Test void testBigQueryUnicode() {
  final Function<RelBuilder, RelNode> relFn = b ->
      b.scan("EMP")
          .filter(
              b.call(SqlStdOperatorTable.IN, b.field("ENAME"),
                  b.literal("schön")))
          .build();
  final String expectedSql = "SELECT *\n"
      + "FROM \"scott\".\"EMP\"\n"
      + "WHERE \"ENAME\" IN ('schön')";
  relFn(relFn).withBigQuery().ok(expectedSql);
}
{code}
> Incorrect SQL Emitted for Unicode for Several Dialects
> ------------------------------------------------------
>
> Key: CALCITE-3933
> URL: https://issues.apache.org/jira/browse/CALCITE-3933
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.22.0
> Environment: master with latest commit on April 15 (dfb842e55e1fa7037c8a731341010ed1c0cfb6f7)
> Reporter: Aryeh Hillman
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.3.4#803005)