Currently Ignite has no way to detect SQL performance regressions
between different versions. We have a Yardstick benchmark module, but
it has several drawbacks:
- it doesn't compare different Ignite versions
- it doesn't check the query result
- it cannot execute randomized SQL queries (aka fuzz testing)

So, Yardstick is not very helpful for detecting SQL performance regressions.

I think we need a brand-new framework for this task and I propose to
implement it by adopting the ideas taken from the Apollo tool paper [1].
The Apollo tool pipeline works like this:

1. Apollo starts two different versions of a database simultaneously.
2. Then Apollo populates them with the same dataset.
3. Apollo generates random SQL queries using an external library (e.g.
SQLSmith [2]).
4. Each query is executed in both database versions. Execution time is
measured by the framework.
5. If the execution time difference for the same query exceeds some
threshold (say, 2x slower), the query is logged.
6. Apollo then tries to simplify the problematic queries in order to
obtain the minimal reproducer.
7. Apollo can also automatically run a binary search over the git
history to find the bad commit.
8. It can also localize the root cause of the regression by means of
statistical debugging.
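
Steps 4 and 5 boil down to timing the same query on both versions and
comparing the ratio against a threshold. Below is a minimal Java sketch
of that check. The class and method names are hypothetical (they come
from neither Apollo nor Ignite), and taking the median of several runs
is an assumption made here to dampen noise such as GC pauses.

```java
import java.util.Arrays;

/** Hypothetical helper; these names are illustrative only. */
final class RegressionCheck {
    private RegressionCheck() {}

    /**
     * Runs the query several times and returns the median elapsed time
     * (the upper median when the number of runs is even).
     */
    static long medianNanos(Runnable query, int runs) {
        if (runs <= 0)
            throw new IllegalArgumentException("runs must be positive");

        long[] samples = new long[runs];

        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            samples[i] = System.nanoTime() - start;
        }

        Arrays.sort(samples);

        return samples[runs / 2];
    }

    /** True if the candidate is more than 'threshold' times slower than the baseline. */
    static boolean isRegression(long baselineNanos, long candidateNanos, double threshold) {
        return candidateNanos > baselineNanos * threshold;
    }
}
```

With a threshold of 2.0, a query that takes 100 ms on the old version
and 250 ms on the new one would be logged, while one that takes 150 ms
would not.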

I think we don't have to implement all of these Apollo steps. The first
4 steps will be enough for our needs.

My proposal is to create a new module called 'sql-testing'. We need a
separate module because it should be suitable for both query engines:
the H2-based one and the upcoming Calcite-based one. This module will
contain a test suite which works in the following way:
1. It starts two Ignite clusters with different versions (current
version and the previous release version).
2. The framework then runs randomly generated queries in both clusters
and measures the execution time in each cluster. We need to port the
SQLSmith [2] library from C++ to Java for this step. But initially we
can start with a set of hardcoded queries and postpone the SQLSmith
port. Randomized queries can be added later.
3. All problematic queries are then reported as performance issues. In
this way we can manually examine the problems.
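
To make the idea more concrete, here is a rough Java sketch of the core
loop such a suite could have. Everything in it is an assumption for
illustration: QueryRunner is a hypothetical abstraction over one
cluster (a real one might wrap a JDBC connection), and the result
comparison is an extra check prompted by the Yardstick drawback
mentioned above, not part of the three steps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these types exist in Ignite today. */
final class SqlRegressionSuite {
    /** Executes a query on one cluster and returns its rows. */
    interface QueryRunner {
        List<List<Object>> run(String sql);
    }

    /** One reported problem: the query and a human-readable reason. */
    static final class Issue {
        final String sql;
        final String reason;

        Issue(String sql, String reason) {
            this.sql = sql;
            this.reason = reason;
        }
    }

    /** Runs every query on both clusters and collects the problematic ones. */
    static List<Issue> compare(QueryRunner baseline, QueryRunner candidate,
        List<String> queries, double threshold) {
        List<Issue> issues = new ArrayList<>();

        for (String sql : queries) {
            long t0 = System.nanoTime();
            List<List<Object>> expected = baseline.run(sql);
            // Guard against a zero reading on clocks with coarse resolution.
            long baseNanos = Math.max(System.nanoTime() - t0, 1L);

            long t1 = System.nanoTime();
            List<List<Object>> actual = candidate.run(sql);
            long candNanos = System.nanoTime() - t1;

            if (!asMultiset(expected).equals(asMultiset(actual)))
                issues.add(new Issue(sql, "result mismatch"));
            else if (candNanos > baseNanos * threshold)
                issues.add(new Issue(sql, "slower: " + baseNanos + " ns -> " + candNanos + " ns"));
        }

        return issues;
    }

    /** Counts rows so that the comparison ignores row order but not duplicates. */
    private static Map<List<Object>, Integer> asMultiset(List<List<Object>> rows) {
        Map<List<Object>, Integer> counts = new HashMap<>();

        for (List<Object> row : rows)
            counts.merge(row, 1, Integer::sum);

        return counts;
    }
}
```

The hardcoded query set would simply be the 'queries' list. A later
SQLSmith port could feed generated queries into the same loop. A single
timing per query is noisy, so a real suite would likely want warm-up
and repeated runs.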

This tool will bring a certain amount of robustness to our SQL layer as
well as some confidence in the absence of SQL query regressions.

What do you think?

[1] http://www.vldb.org/pvldb/vol13/p57-jung.pdf
[2] https://github.com/anse1/sqlsmith

Kind Regards
Roman Kondakov

