paul-rogers opened a new pull request, #13106:
URL: https://github.com/apache/druid/pull/13106
As we move forward with the catalog and other projects, it has become clear
we need a SQL query test structure which is a bit more flexible than the
current "Calcite tests."
### Test Case File
The revised framework, inspired by the one used by Apache Impala, uses a "test
case" file to define the test inputs and expected outputs. Each ".case" file
contains one or more test cases, each of which has one or more sections. The
file uses a somewhat unusual syntax to avoid the need to quote JSON or SQL. (It
would seem to be more of a "Druid fit" to use JSON, but imagine the complexity
of quoting SQL that contains JSON within a JSON-formatted file.)
A typical test case looks like this:
```text
==============================================================
Converted from testSelectConstantArrayExpressionFromTable()
=== case
SELECT constant array expression from table
=== SQL
SELECT ARRAY[1,2] as arr, dim1 FROM foo LIMIT 1
=== options
sqlCompatibleNulls=both
vectorize=true
=== schema
arr INTEGER ARRAY
dim1 VARCHAR
=== plan
LogicalSort(fetch=[1])
  LogicalProject(arr=[ARRAY(1, 2)], dim1=[$2])
    LogicalTableScan(table=[[druid, foo]])
=== native
{
  "queryType" : "scan",
  "dataSource" : {
    "type" : "table",
    "name" : "foo"
  },
  "intervals" : {
    "type" : "intervals",
    "intervals" : [ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z" ]
  },
  "virtualColumns" : [ {
    "type" : "expression",
    "name" : "v0",
    "expression" : "array(1,2)",
    "outputType" : "ARRAY<LONG>"
  } ],
  "resultFormat" : "compactedList",
  "limit" : 1,
  "columns" : [ "dim1", "v0" ],
  "legacy" : false,
  "granularity" : {
    "type" : "all"
  }
}
=== results
["[1,2]",""]
```
A test case has multiple sections, as described in the `syntax.md` file in
this PR. The primary sections are `sql`, which defines the query, and `results`,
which defines the expected output as a set of JSON lines. The other sections
validate other parts of the query process, such as the logical plan, the
output schema, and the Druid native query.
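As an illustration only, the section layout can be sketched in a few lines of Python. This is not the Java code in this PR. It assumes, based solely on the example above, that a section begins at a line starting with `=== ` followed by the section name, that section names are case-insensitive, and that lines before the first header (the banner and the "Converted from" note) are comments. The authoritative rules are in `syntax.md`; `parse_case` and `parse_results` are hypothetical names.

```python
import json


def parse_case(text):
    """Split one test case into {section_name: body}.

    Assumption: a section header is a line that starts with '=== '.
    Lines before the first header are treated as comments and dropped.
    """
    sections = {}
    name = None
    for line in text.splitlines():
        if line.startswith("=== "):
            # Lower-case the name so "SQL" and "sql" are the same section.
            name = line[4:].strip().lower()
            sections[name] = []
        elif name is not None:
            sections[name].append(line)
    return {key: "\n".join(body).strip() for key, body in sections.items()}


def parse_results(body):
    """Parse a results section as JSON lines: one JSON value per non-blank line."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


case_text = """\
==============================================================
Converted from testSelectConstantArrayExpressionFromTable()
=== case
SELECT constant array expression from table
=== SQL
SELECT ARRAY[1,2] as arr, dim1 FROM foo LIMIT 1
=== results
["[1,2]",""]
"""

sections = parse_case(case_text)
rows = parse_results(sections["results"])
```

Because each section body is kept as raw text until it is interpreted, the SQL and JSON need no quoting or escaping, which is the motivation for the custom syntax.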
The `context` section specifies any additional query context items to
include in the query. The `options` section provides instructions to the test
framework itself, such as whether to run a query vectorized or not.
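A rough Python sketch of reading the `options` section follows. It assumes that each non-blank line is a `name=value` pair, as in the example above; whether `syntax.md` allows other forms is not covered here. `parse_options` is a hypothetical helper, and values are kept as strings because how the framework interprets them (for example `both`) is not specified in this description.

```python
def parse_options(body):
    """Parse 'name=value' lines into a dict of strings.

    Assumption: one option per line, split at the first '='.
    """
    options = {}
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


opts = parse_options("sqlCompatibleNulls=both\nvectorize=true\n")
```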
### Scope of This PR
This PR includes only the test case definitions: both for the expected
values (parsed from a ".case" file) and for an actual query run. This PR does not
yet include the mechanism to run the cases. That mechanism depends on
various other in-flight PRs and will arrive in a future PR. This PR allows us
to establish the case file format itself, which can be used in both Calcite unit
tests and new ITs.
<hr>
This PR has:
- [X] been self-reviewed.
- [X] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [X] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [X] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.