Hey Mihai,

Thanks for starting this discussion!

Let's focus on the first question for now:

Q1: Should the new slt module under PR-3145 [1] become part of Calcite
repo or get its own?

For those who have not followed the discussion under CALCITE-5615
[2], let me try to summarize a few things as I understand them;
Mihai can amend or correct things if necessary.

The new slt module is essentially a port of the sqllogictest utility
[3] to Java. It can parse the test-script format used by sqllogictest
and can run these scripts against JDBC-compliant databases. It also
allows for extensions to support Java engines without a JDBC
interface.

From my perspective, the code in [1] could perfectly stand on its own
in a separate repo; there are already ports of sqllogictest in other
languages such as Rust [4] and the latter appears to be quite popular.
The sqllogictest parser/runner presents some similarities with the
Quidem [5] executor that we are using for certain tests in Calcite.
The Quidem project has its own repo although we are making use of it
in Calcite.
If it becomes a separate repo, then the test scripts could also become
part of the project, making it more self-contained.

On the other hand, we already have a testkit module in Calcite, so
bringing in another module for testing purposes would not be out of
place; why not slt as well? If it becomes part of Calcite, it can get
more visibility, and maintenance becomes easier since more people (not
only Mihai) would be able to review and merge changes.

Since we are talking about a new module, I would like to see some more
people share their opinion on the topic before I continue the review.

Best,
Stamatis

[1] https://github.com/apache/calcite/pull/3145
[2] https://issues.apache.org/jira/browse/CALCITE-5615
[3] https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki
[4] https://github.com/risinglightdb/sqllogictest-rs
[5] https://github.com/julianhyde/quidem



On Sat, Apr 15, 2023 at 11:31 AM Michael Mior <[email protected]> wrote:
>
> Very cool! One approach could be to set these tests to run periodically
> (daily/weekly) as opposed to being part of the CI pipeline. That way we
> still have a mechanism to keep tabs on bugs but the whole build isn't
> slow/broken until this is fixed.
>
> On Fri, Apr 14, 2023, 15:20 Mihai Budiu <[email protected]> wrote:
>
> > Hello all,
> >
> > I have submitted a PR for Calcite with a standalone executable that runs
> > the Sql Logic Test suite of 7+ million tests from sqlite.
> >
> > This is the JIRA case: https://issues.apache.org/jira/browse/CALCITE-5615
> > And this is the PR: https://github.com/apache/calcite/pull/3145
> >
> > As Stamatis pointed out, the PR isn't really specific to Calcite, it is a
> > general framework in Java to run these tests on any JDBC compliant
> > executor. So a question is whether this belongs to the Calcite project, or
> > some place else. sqlite is a C project, I didn't see any Java in their
> > source tree.
> >
> > Please note that SQLite is in the public domain, so their licensing terms
> > are not an obstacle to using the test scripts.
> >
> > The submitted code runs Calcite in its default configuration, but the
> > intent is for other projects that build Calcite-based compilers to be able
> > to test them by subclassing the "TestExecutors". In our own project (
> > https://github.com/vmware/sql-to-dbsp-compiler) we have done exactly that,
> > and we are not using the JDBC API.
> >
> > The testsuite does find bugs in Calcite, both crashes and incorrect
> > results. So I think its usefulness is not debated.
> >
> > The second question is about the packaging of this program; right now it
> > has a main() entry point and it prints the results to stderr for human
> > consumption and triage. It is not clear to me how it should be inserted in
> > a CI infrastructure, since running all 7 million tests could take a long
> > time. One possible extension would be to have the program generate a
> > regression test for Calcite for each bug it finds, but I haven't
> > implemented this feature yet (and many failures could be due to the same
> > bug). But even that mode would not naturally integrate in a CI
> > infrastructure.
> >
> > A simple possibility is for me to just publish the code as an independent
> > project on github with an MIT license (the code is derived from our
> > MIT-licensed project) and just advertise it to the Calcite community.
> >
> > I would very much appreciate guidance.
> >
> > Mihai Budiu
> >
