Re: [DISCUSS] Calcite as SQL translator, and dialect testing

Yanjing Wang Wed, 27 Nov 2024 01:02:57 -0800

Hello, this discussion has been ongoing for a week. Let's move it forward.
Does anyone else have any suggestions?


On Tue, Nov 19, 2024 at 14:47, Yanjing Wang <zhuangzixiao...@gmail.com> wrote:

> 1. RelToSqlConverterTest
> The class name implies that it tests conversion from RelNode to SQL, but
> now its RelNode comes from SQL in different dialects, paired with a target
> SQL. That makes the test case difficult for me to understand:
>
> @Test void testNullCollation() {
>   final String query = "select * from \"product\" order by \"brand_name\"";
>   final String expected = "SELECT *\n"
>       + "FROM \"foodmart\".\"product\"\n"
>       + "ORDER BY \"brand_name\"";
>   final String sparkExpected = "SELECT *\n"
>       + "FROM `foodmart`.`product`\n"
>       + "ORDER BY `brand_name` NULLS LAST";
>   sql(query)
>       .withPresto().ok(expected)
>       .withSpark().ok(sparkExpected);
> }
>
>
> Why does the Spark SQL have 'NULLS LAST' at the end? That information is
> missing unless we add the source rel or the source dialect.
>
> 2. Dialect-to-dialect translation
> I think it's necessary. Dialect translation and materialized view
> substitution are common in the big data domain, and it would be beneficial
> to make Calcite more user-friendly for these scenarios.
> Could we create end-to-end test cases that start with the source SQL of
> one dialect and end with the target SQL of another (or the same) dialect?
> We could also include user-defined materialized views in the process and
> perform result comparison.
>
> On Tue, Nov 19, 2024 at 07:21, Julian Hyde <jhyde.apa...@gmail.com> wrote:
>
>> A recent case, https://issues.apache.org/jira/browse/CALCITE-6693, "Add
>> Source SQL Dialect to RelToSqlConverterTest", implies that people are using
>> Calcite to translate SQL from one dialect to another. The test wanted to test
>> translating a SQL string from Presto to Redshift. I pushed back on that
>> case (and its PR) because that test is for translating RelNode to a SQL
>> dialect, not about handling source dialects.
>>
>> Dialect-to-dialect translation is undoubtedly something that people do
>> with Calcite. I think we should recognize that fact, and document how
>> someone can use Calcite as a translator. When we have documented it, we can
>> also add some tests.
>>
>> I am also worried about our dialect tests in general. The surface area to
>> be tested is huge, and the tests are added haphazardly, so while many cases
>> are tested there is a much greater set of cases that are not tested.
>> Consider, for example, how testSelectQueryWithGroupByEmpty [1] tests
>> against MySQL, Presto, StarRocks but not against BigQuery, Snowflake or
>> Postgres. If we want our SQL dialect support to be high quality, we have to
>> find a way to improve the coverage of our tests. I logged
>> https://issues.apache.org/jira/browse/CALCITE-5529 with some ideas but I
>> need help implementing it.
>>
>> Julian
>>
>> [1]
>> https://github.com/apache/calcite/blob/f2ec11fe7e23ecf2db903bc02c40609242993aad/core/src/test/java/org/apache/calcite/rel/rel2sql/RelToSqlConverterTest.java#L577
>>
>>
>>
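
On the 'NULLS LAST' question in point 1 of the quoted message, here is a minimal sketch of one plausible explanation. It is a standalone model, not Calcite's code. It assumes the plan asks for ascending order with NULLs last, that Spark's dialect sorts NULLs first by default for ascending keys, and that Presto's sorts them last. The class, enum, and method names are all hypothetical.

```java
class NullCollationSketch {
  /** Where a dialect places NULLs when the query does not say (hypothetical model). */
  enum NullCollation { HIGH, LOW, FIRST, LAST }

  /** Returns true if the dialect's default already puts NULLs in the requested position. */
  static boolean isDefaultOrder(NullCollation c, boolean nullsFirst, boolean desc) {
    switch (c) {
      case FIRST: return nullsFirst;
      case LAST:  return !nullsFirst;
      case LOW:   return nullsFirst != desc;  // NULL is smallest: first if ASC, last if DESC
      case HIGH:  return nullsFirst == desc;  // NULL is largest: last if ASC, first if DESC
      default: throw new AssertionError(c);
    }
  }

  /** Renders one ORDER BY key; emits NULLS FIRST/LAST only when the default differs. */
  static String orderKey(String column, boolean desc, boolean nullsFirst,
      NullCollation dialectDefault) {
    StringBuilder sb = new StringBuilder(column);
    if (desc) {
      sb.append(" DESC");
    }
    if (!isDefaultOrder(dialectDefault, nullsFirst, desc)) {
      sb.append(nullsFirst ? " NULLS FIRST" : " NULLS LAST");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Assumed plan-side request: ascending, NULLs last.
    // Assumed Spark-like default (LOW): the explicit clause is needed.
    System.out.println(orderKey("`brand_name`", false, false, NullCollation.LOW));
    // Assumed Presto-like default (LAST): nothing extra is emitted.
    System.out.println(orderKey("\"brand_name\"", false, false, NullCollation.LAST));
  }
}
```

Under these assumptions, the same plan yields an explicit NULLS LAST only for the target whose default differs. The reason lives in the plan and the target dialect, and neither is visible in the test's source string.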
