Re: [DISCUSS] Calcite as SQL translator, and dialect testing

Yanjing Wang Mon, 18 Nov 2024 22:57:58 -0800

1. RelToSqlConverterTest
The class name implies that it tests conversion from RelNode to SQL, but its
RelNode now comes from SQL in different dialects, paired with target SQL. I
find test cases like the following difficult to understand:


@Test void testNullCollation() {
  final String query = "select * from \"product\" order by \"brand_name\"";
  final String expected = "SELECT *\n"
      + "FROM \"foodmart\".\"product\"\n"
      + "ORDER BY \"brand_name\"";
  final String sparkExpected = "SELECT *\n"
      + "FROM `foodmart`.`product`\n"
      + "ORDER BY `brand_name` NULLS LAST";
  sql(query)
      .withPresto().ok(expected)
      .withSpark().ok(sparkExpected);
}


Why does the Spark SQL end with 'NULLS LAST'? That information is missing
unless we add the source rel or the source dialect.
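
One possible reading, as a minimal sketch rather than Calcite's actual code:
the extra clause appears when the null ordering the plan asks for differs
from the target dialect's default. The enum and method names below are
hypothetical, and the defaults used in main (the plan sorts NULLs last for
ascending keys, Presto puts NULLs last by default, Spark treats NULL as the
lowest value) are assumptions; none of them is visible in the test itself.

```java
/** Toy model (not Calcite code) of when ORDER BY needs an explicit NULLS clause. */
final class NullOrderingSketch {

  /** Hypothetical description of how a dialect sorts NULLs by default. */
  enum DefaultNulls {
    HIGH,   // NULL is the largest value: last for ASC, first for DESC
    LOW,    // NULL is the smallest value: first for ASC, last for DESC
    FIRST,  // NULLs always come first
    LAST;   // NULLs always come last

    /** Whether this default puts NULLs last for the given sort direction. */
    boolean nullsLast(boolean descending) {
      switch (this) {
        case HIGH: return !descending;
        case LOW: return descending;
        case LAST: return true;
        default: return false; // FIRST
      }
    }
  }

  /** Emits an ORDER BY item, adding NULLS FIRST/LAST only if the target default differs. */
  static String orderBy(String quotedColumn, boolean descending,
      boolean wantNullsLast, DefaultNulls targetDefault) {
    StringBuilder sb = new StringBuilder("ORDER BY ").append(quotedColumn);
    if (descending) {
      sb.append(" DESC");
    }
    if (targetDefault.nullsLast(descending) != wantNullsLast) {
      sb.append(wantNullsLast ? " NULLS LAST" : " NULLS FIRST");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Assumption: the plan wants NULLs last for an ascending sort.
    System.out.println(orderBy("\"brand_name\"", false, true, DefaultNulls.LAST)); // Presto-like
    System.out.println(orderBy("`brand_name`", false, true, DefaultNulls.LOW));    // Spark-like
  }
}
```

Under those assumptions the Presto-like target needs no clause while the
Spark-like target does, which matches the two expected strings above. The
deciding input is the plan's null ordering, and the test never states it.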

2. Dialect-to-dialect translation
I think it is necessary. Dialect translation and materialized view
substitution are common in the big data domain, so it would be beneficial to
make Calcite more user-friendly for these scenarios.
Could we create end-to-end test cases that start with source SQL in one
dialect and end with target SQL in another (or the same) dialect? We
could also include user-defined materialized views in the process and
compare the query results.
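
To make the shape of such a test concrete, here is a rough sketch in plain
Java. Every name in it (Case, Translator, run, toyTranslate) is hypothetical
and none of it is Calcite API. The toy translator only rewrites identifier
quoting; a real one would parse with the source dialect, plan (optionally
with materialized views), and unparse with the target dialect.

```java
import java.util.ArrayList;
import java.util.List;

/** Table-driven end-to-end translation cases (illustrative only). */
final class DialectTranslationCases {

  /** One case: source dialect and SQL in, target dialect and expected SQL out. */
  static final class Case {
    final String sourceDialect;
    final String sourceSql;
    final String targetDialect;
    final String expectedSql;

    Case(String sourceDialect, String sourceSql,
        String targetDialect, String expectedSql) {
      this.sourceDialect = sourceDialect;
      this.sourceSql = sourceSql;
      this.targetDialect = targetDialect;
      this.expectedSql = expectedSql;
    }
  }

  /** Hypothetical hook where a real dialect-to-dialect translator would plug in. */
  interface Translator {
    String translate(String sourceDialect, String sourceSql, String targetDialect);
  }

  /** Runs every case and returns one message per mismatch. */
  static List<String> run(List<Case> cases, Translator translator) {
    List<String> failures = new ArrayList<>();
    for (Case c : cases) {
      String actual =
          translator.translate(c.sourceDialect, c.sourceSql, c.targetDialect);
      if (!c.expectedSql.equals(actual)) {
        failures.add(c.sourceDialect + " -> " + c.targetDialect
            + ": expected [" + c.expectedSql + "] but got [" + actual + "]");
      }
    }
    return failures;
  }

  /** Stand-in translator: swaps double quotes for backticks when targeting "spark". */
  static String toyTranslate(String sourceDialect, String sourceSql,
      String targetDialect) {
    return "spark".equals(targetDialect) ? sourceSql.replace('"', '`') : sourceSql;
  }

  public static void main(String[] args) {
    List<Case> cases = new ArrayList<>();
    cases.add(new Case("presto", "select \"a\" from \"t\"",
        "spark", "select `a` from `t`"));
    cases.add(new Case("presto", "select \"a\" from \"t\"",
        "presto", "select \"a\" from \"t\""));
    System.out.println(run(cases, DialectTranslationCases::toyTranslate));
  }
}
```

Because each case names its source dialect explicitly, a reader can see where
the semantics come from. Result comparison could be a second check that runs
both SQL strings against the same data.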

Julian Hyde <jhyde.apa...@gmail.com> wrote on Tue, Nov 19, 2024 at 07:21:

> A recent case, https://issues.apache.org/jira/browse/CALCITE-6693, "Add
> Source SQL Dialect to RelToSqlConverterTest", implies that people are using
> Calcite to translate SQL from one dialect to another. The test wanted to test
> translating a SQL string from Presto to Redshift. I pushed back on that
> case (and its PR) because that test is for translating RelNode to a SQL
> dialect, not about handling source dialects.
>
> Dialect-to-dialect translation is undoubtedly something that people do
> with Calcite. I think we should recognize that fact, and document how
> someone can use Calcite as a translator. When we have documented it, we can
> also add some tests.
>
> I am also worried about our dialect tests in general. The surface area to
> be tested is huge, and the tests are added haphazardly, so while many cases
> are tested there is a much greater set of cases that are not tested.
> Consider, for example, how testSelectQueryWithGroupByEmpty [1] tests
> against MySQL, Presto, StarRocks but not against BigQuery, Snowflake or
> Postgres. If we want our SQL dialect support to be high quality, we have to
> find a way to improve the coverage of our tests. I logged
> https://issues.apache.org/jira/browse/CALCITE-5529 with some ideas but I
> need help implementing it.
>
> Julian
>
> [1]
> https://github.com/apache/calcite/blob/f2ec11fe7e23ecf2db903bc02c40609242993aad/core/src/test/java/org/apache/calcite/rel/rel2sql/RelToSqlConverterTest.java#L577
>
>
>
