[ https://issues.apache.org/jira/browse/CALCITE-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aryeh Hillman updated CALCITE-3933:
-----------------------------------
    Description: 
A string literal like "schön" should emit "schön" in SQL for most dialects 
(BigQuery, MySQL, Redshift and a few others), but instead emits
{code:java}
u&'sch\\00f6n' {code}
which is a Unicode-escaped string literal (00f6 is the code point of "ö", the 
same value it has in ISO-8859-1). 

Some of the above dialects may support this escape syntax, but in my 
tests with *BigQuery Standard SQL*, *MySQL*, and *Redshift* engines, the 
following fails:
{code:java}
select u&'sch\\00f6n';{code}
But this succeeds:
{code:java}
select 'schön'; {code}
A test that demonstrates the problem (add it to 
`org/apache/calcite/rel/rel2sql/RelToSqlConverterTest.java` and run it from there):
{code:java}
@Test void testBigQueryUnicode() {
  // Build a filter on EMP whose literal contains a non-ASCII character.
  final Function<RelBuilder, RelNode> relFn = b ->
          b.scan("EMP")
                  .filter(
                          b.call(SqlStdOperatorTable.IN, b.field("ENAME"),
                                  b.literal("schön")))
                  .build();
  // The literal should be emitted verbatim, not as u&'sch\\00f6n'.
  final String expectedSql = "SELECT *\n"
          + "FROM \"scott\".\"EMP\"\n"
          + "WHERE \"ENAME\" IN ('schön')";
  relFn(relFn).withBigQuery().ok(expectedSql);
}
{code}
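
To make the two literal forms concrete, here is a minimal standalone sketch of how a string could be quoted either verbatim or with Unicode escapes. It is not Calcite's implementation: the class `UnicodeLiteralSketch`, its methods, and the `dialectSupportsUnicodeEscapes` flag are all hypothetical names made up for illustration. It also assumes the emitted SQL contains a single backslash and that the doubled backslash shown above comes from Java string escaping.

```java
public final class UnicodeLiteralSketch {
  private UnicodeLiteralSketch() {}

  /** Quotes a string as a plain SQL literal, doubling embedded single quotes. */
  public static String plainLiteral(String s) {
    return "'" + s.replace("'", "''") + "'";
  }

  /**
   * Quotes a string as a u&'...' literal. Characters outside printable ASCII
   * are written as a backslash followed by four hex digits (one UTF-16 unit).
   */
  public static String unicodeEscapedLiteral(String s) {
    StringBuilder sb = new StringBuilder("u&'");
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '\'') {
        sb.append("''");            // quote character is doubled
      } else if (c == '\\') {
        sb.append("\\\\");          // escape character itself is doubled
      } else if (c < 0x20 || c > 0x7e) {
        sb.append(String.format("\\%04x", (int) c));
      } else {
        sb.append(c);
      }
    }
    return sb.append('\'').toString();
  }

  /** Hypothetical per-dialect switch: escape only where the engine accepts it. */
  public static String quote(String s, boolean dialectSupportsUnicodeEscapes) {
    return dialectSupportsUnicodeEscapes
        ? unicodeEscapedLiteral(s)
        : plainLiteral(s);
  }

  public static void main(String[] args) {
    // "sch" + o-umlaut + "n", written with a Java escape to avoid
    // depending on the source file encoding.
    String name = "sch\u00f6n";
    System.out.println(quote(name, true));
    System.out.println(quote(name, false));
  }
}
```

Under these assumptions, a dialect such as BigQuery, MySQL, or Redshift would take the `plainLiteral` path, which matches the output the test above expects.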

  was:
A string literal like "schön" should emit "schön" in SQL for most dialects 
(BigQuery, MySQL, Redshift and a few others), but instead emits 
"u&'sch\\00f6n'" (ISO-8859-1 ASCII).  It's possible that some dialects may 
support ISO-8859, but in my tests with BigQuery Standard SQL, MySQL, and 
Redshift engines, "select u&'sch\\00f6n';` fails but "select 'schön';` succeeds.

Test that demonstrates (add to 
`org/apache/calcite/rel/rel2sql/RelToSqlConverterTest.java` and run from there):
{code:java}
@Test void testBigQueryUnicode() {
  final Function<RelBuilder, RelNode> relFn = b ->
          b.scan("EMP")
                  .filter(
                          b.call(SqlStdOperatorTable.IN, b.field("ENAME"),
                                  b.literal("schön")))
                  .build();
  final String expectedSql = "SELECT *\n"
          + "FROM \"scott\".\"EMP\"\n"
          + "WHERE \"ENAME\" IN ('schön')";
  relFn(relFn).withBigQuery().ok(expectedSql);
}
{code}
  


> Incorrect SQL Emitted for Unicode for Several Dialects
> ------------------------------------------------------
>
>                 Key: CALCITE-3933
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3933
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.22.0
>         Environment: master with latest commit on April 15 (dfb842e55e1fa7037c8a731341010ed1c0cfb6f7)
>            Reporter: Aryeh Hillman
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
