Hello all,

I have submitted a PR for Calcite with a standalone executable that runs
the SQL Logic Test suite of 7+ million tests from SQLite.

This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615
And this is the PR: https://github.com/apache/calcite/pull/3145

As Stamatis pointed out, the PR isn't really specific to Calcite; it is a
general Java framework for running these tests on any JDBC-compliant
executor. So one question is whether it belongs in the Calcite project or
somewhere else. SQLite is a C project, and I didn't see any Java in its
source tree.

Please note that SQLite is in the public domain, so its licensing terms
are not an obstacle to using the test scripts.

The submitted code runs Calcite in its default configuration, but the
intent is for other projects that build Calcite-based compilers to be able
to test them by subclassing the "TestExecutors". In our own project
(https://github.com/vmware/sql-to-dbsp-compiler) we have done exactly that,
and we are not using the JDBC API.
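
As a rough illustration of the subclassing idea, here is a minimal sketch.
It is not the code in the PR: SketchExecutor, CannedExecutor and all method
names below are hypothetical, chosen only to show how an engine that does
not go through JDBC could plug in by overriding two methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class; the executor classes in the PR may look different. */
abstract class SketchExecutor {
    /** Run one statement that returns no rows (e.g. CREATE TABLE, INSERT). */
    abstract void statement(String sql) throws Exception;

    /** Run one query and return its rows rendered as strings. */
    abstract List<String> query(String sql) throws Exception;

    /** Returns true when the query output matches the expected rows. */
    boolean check(String sql, List<String> expected) {
        try {
            return query(sql).equals(expected);
        } catch (Exception e) {
            return false; // a crash counts as a failed test
        }
    }
}

/** Stand-in engine that needs no JDBC: it only knows one canned answer. */
class CannedExecutor extends SketchExecutor {
    private final List<String> log = new ArrayList<>();

    @Override
    void statement(String sql) {
        log.add(sql);
    }

    @Override
    List<String> query(String sql) {
        log.add(sql);
        if (sql.equals("SELECT 1")) {
            return List.of("1");
        }
        throw new UnsupportedOperationException(sql);
    }

    /** Number of statements and queries seen so far. */
    int executed() {
        return log.size();
    }
}

public class ExecutorSketch {
    public static void main(String[] args) {
        CannedExecutor exec = new CannedExecutor();
        exec.statement("CREATE TABLE t(a INTEGER)");
        System.out.println(exec.check("SELECT 1", List.of("1"))); // true
        System.out.println(exec.check("SELECT 2", List.of("2"))); // false
        System.out.println(exec.executed());                      // 3
    }
}
```

A real subclass would replace the canned answers with calls into the
engine under test; the test driver only ever sees the base class.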

The test suite does find bugs in Calcite, both crashes and incorrect
results, so I think its usefulness is not in question.

The second question is about the packaging of this program. Right now it
has a main() entry point and prints the results to stderr for human
consumption and triage. It is not clear to me how it should be integrated
into a CI infrastructure, since running all 7 million tests could take a
long time. One possible extension would be to have the program generate a
regression test for Calcite for each bug it finds, but I haven't
implemented this feature yet (and many failures could be due to the same
bug). Even that mode would not integrate naturally into a CI
infrastructure.

A simple possibility is for me to publish the code as an independent
project on GitHub under an MIT license (the code is derived from our
MIT-licensed project) and advertise it to the Calcite community.

I would very much appreciate guidance.

Mihai Budiu
