Hi mbudiu,thank you for bringing SqlLogicTest to calcite. Now we're working on an optimizer based on Calcite that supports Postgresql semantics as a whole. I am very interested in your bringing the Postgres test to Calcite and would like to participate in it. Could you please give me a complete collection of reference materials? The current document information is scattered, some in sql-logic-test github and some in sql-to-dbsp-compiler github, which requires everyone to read on github by themselves. If you could give me a complete end-to-end demo of adding tests, I would really appreciate it. Best, LakeShen
<[email protected]> 于2023年7月29日周六 01:04写道: > The correct link is > https://github.com/feldera/dbsp/tree/main/sql-to-dbsp-compiler, but it's > not particularly important, I apologize for the broken one. > > I appreciate you pointing out the Sql Logic Test project, I wrote that > code too. But these two projects are almost entirely disjoint. SLT brings a > few million pre-written tests. But none of the SLT tests exercises any of > the SQL functions. So you need to write many more tests for each *function* > implemented, and most contributions to Calcite lately are functions in > different dialects. > > Moreover, while SLT found quite a few bugs, I only had time to file one of > them so far. For each failing test I have to figure out (1) whether it's a > duplicate, (2) to create a minimal reproduction, (3) to figure out which > part of the compiler is faulty in order to understand where to insert the > test, (4) create a reproduction tailored for the faulty component. > Reproductions for planner bugs are different than reproductions for library > bugs. But some bugs happen only if you combine some planner rules with some > functions. > > With the proposal below you only need a CLI to a database to generate new > tests (e.g., BigQuery): you execute write the queries in the CLI, then > copy-paste the output into a test. This makes it easier to file bugs, but > harder to diagnose them. But it also makes it much easier to write lots of > tests, because it enables people without Calcite expertise to write the > tests. > > Mihai > > -----Original Message----- > From: stanilovsky evgeny > Sent: Friday, July 28, 2023 2:43 AM > To: [email protected] > Subject: Re: Easier and more comprehensive testing > > Hello, your github link doesn`t open. > plz check discussion here in dev list titled: > > Running Sql Logic Tests for Calcite > This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615 > And this is the PR: https://github.com/apache/calcite/pull/3145 > > > > Hello, > > > > > > I am working to test our calcite-based compiler > > (https://github.com/feldera/dbsp/sql-to-dbsp-compiler), and since I am > > lazy I am borrowing tests from other open-source test suites, like > > Postgres. > > > > I am finding bugs in Calcite with a relatively high frequency, as you > > may have noticed if you follow the JIRA. I would say that one in 3 SQL > > functions I test turns out to have some problems. So our techniques > > seem to be effective at finding bugs. > > > > I think that some of the techniques we are using could be applied to > > Calcite as well: > > > > > > * First, we run all tests with and without the optimizer. In > > particular, we go through all the "constant evaluation" rules of Calcite. > > For constant expressions the results should be identical with and > > without optimizer. They aren't always, and we have found quite a few > > cases where the compile-time evaluation crashes or produces wrong > > results. So the compile-time and the run-time evaluator cross-check > > each other. > > * Second, we have tried to make it very easy to write end-to-end > > tests, at least positive tests (which are supposed to return a value > > rather than an error). Here is a stylized excerpt from our testing > > code: > > > > > > @Override > > > > public void prepareData() { > > > > this.executeStatements("CREATE TABLE FLOAT4_TBL (f1 > float4);\n" > > + > > > > "INSERT INTO FLOAT4_TBL(f1) VALUES (' 0.0');\n" + > > > > "INSERT INTO FLOAT4_TBL(f1) VALUES ('1004.30 ');\n" + > > > > "INSERT INTO FLOAT4_TBL(f1) VALUES (' -34.84 > > ');\n" + > > > > "INSERT INTO FLOAT4_TBL(f1) VALUES > > ('1.2345678901234e+20');\n" + > > > > "INSERT INTO FLOAT4_TBL(f1) VALUES > > ('1.2345678901234e-20');"); > > > > } > > > > > > @Test > > > > public void testFPArithmetic() { > > > > this.qs("SELECT f.f1, f.f1 * '-10' AS x FROM FLOAT4_TBL f\n" + > > > > " WHERE f.f1 > '0.0';\n" + > > > > " f1 | x \n" + > > > > "---------------+----------------\n" + > > > > " 1004.3 | -10043\n" + > > > > " 1.2345679e+20 | -1.2345678e+21\n" + > > > > " 1.2345679e-20 | -1.2345678e-19\n" + > > > > "(3 rows)\n" + > > > > "\n" + > > > > "SELECT f.f1, f.f1 + '-10' AS x FROM FLOAT4_TBL f\n" + > > > > " WHERE f.f1 > '0.0';\n" + > > > > " f1 | x \n" + > > > > "---------------+---------------\n" + > > > > " 1004.3 | 994.3\n" + > > > > " 1.2345679e+20 | 1.2345679e+20\n" + > > > > " 1.2345679e-20 | -10\n" + > > > > "(3 rows)\n" + > > > > "\n"); > > > > } > > > > > > These long strings are an almost direct copy-and-paste from tests in > > https://github.com/postgres/postgres/blob/03734a7fed7d924679770adb78a7 > > db8a37 d14188/src/test/regress/expected/float4.out (but they have to > > manually checked, since Calcite does often not match Postgres > > behaviors). A string can contain multiple query-expected result > > strings. > > > > > > The point is that the barrier for writing the tests is quite low. > > > > Granted, this approach has some weaknesses as well, in particular, we > > rely on a specific output format and is brittle in some respects > > (e.g., spaces in output strings). > > > > > > I have seen some *.iq files as resources in the Calcite source code, > > are these used in the same way for testing in Calcite? If yes, that's > > great, and there should be many more. > > > > > > Calcite has lots of unit tests, but I find that our "end-to-end" tests > > are easier to write and have much better code coverage. For example, > > for each failure I find in Calcite I have to dig pretty hard to figure > > out which of the testing files should contain my reproduction (e.g., > > SqlOperatorTest, RelOptRulesTest, RelToSqlConverterTest, etc.), and > > how to exactly write the reproduction (some reproductions require > > editing some huge XML files too). > > That takes a lot of time. > > > > > > I am not necessarily signing up to build this infrastructure, at least > > not right away. I am not sure how much of the stuff I wrote for our > > compiler could be ported directly to Calcite. But I will think about > > it, and I will gladly help this effort. > > > > > > I appreciate any comments and suggestions, > > > > Mihai > >
