paul-rogers opened a new pull request, #13106:
URL: https://github.com/apache/druid/pull/13106

   As we move forward with the catalog and other projects, it has become clear 
that we need a SQL query test structure that is more flexible than the current 
"Calcite tests."
   
   ### Test Case File
   
   The revised framework, inspired by the one used by Apache Impala, uses "test 
case" files to define the test inputs and expected outputs. Each ".case" file 
contains one or more test cases, each of which has one or more sections. The 
files use a somewhat unusual syntax to avoid the need to quote JSON or SQL. 
(JSON might seem more of a "Druid fit," but imagine the complexity of quoting 
SQL that itself contains JSON within a JSON-formatted file.)
   
   A typical test case looks like this:
   
   ```text
   ==============================================================
   Converted from testSelectConstantArrayExpressionFromTable()
   === case
   SELECT constant array expression from table
   === SQL
   SELECT ARRAY[1,2] as arr, dim1 FROM foo LIMIT 1
   === options
   sqlCompatibleNulls=both
   vectorize=true
   === schema
   arr INTEGER ARRAY
   dim1 VARCHAR
   === plan
   LogicalSort(fetch=[1])
     LogicalProject(arr=[ARRAY(1, 2)], dim1=[$2])
       LogicalTableScan(table=[[druid, foo]])
   === native
   {
     "queryType" : "scan",
     "dataSource" : {
       "type" : "table",
       "name" : "foo"
     },
     "intervals" : {
       "type" : "intervals",
    "intervals" : [ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z" ]
     },
     "virtualColumns" : [ {
       "type" : "expression",
       "name" : "v0",
       "expression" : "array(1,2)",
       "outputType" : "ARRAY<LONG>"
     } ],
     "resultFormat" : "compactedList",
     "limit" : 1,
     "columns" : [ "dim1", "v0" ],
     "legacy" : false,
     "granularity" : {
       "type" : "all"
     }
   }
   === results
   ["[1,2]",""]
   ```
   
   The structure has multiple sections, as described in the `syntax.md` file in 
this PR. The primary sections are `sql`, which defines the query, and `results`, 
which defines the expected output as a set of JSON lines. The other sections 
allow validating other parts of the query process, such as the logical plan, the 
output schema, the Druid native query, and so on.
   
   The `context` section specifies any additional query context items to 
include in the query. The `options` section provides instructions to the test 
framework itself, such as whether to run a query vectorized or not.
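
   The parsing mechanism itself is part of this PR's test case definitions; as 
an illustration only, a minimal sketch of splitting a case file into its 
`=== name` sections might look like the following (the class and method names 
here are hypothetical, not taken from this PR):

   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;

   // Illustrative sketch: split a ".case" file into named sections.
   // Assumes "=== name" headers and "====..." case separators as in the
   // example above; the actual parser in this PR may differ.
   public class CaseFileSketch
   {
     public static Map<String, String> parseSections(String caseText)
     {
       Map<String, String> sections = new LinkedHashMap<>();
       String currentName = null;
       StringBuilder body = new StringBuilder();
       for (String line : caseText.split("\n", -1)) {
         if (line.startsWith("=== ")) {
           // A new section header: store the previous section, if any.
           if (currentName != null) {
             sections.put(currentName, body.toString().trim());
           }
           currentName = line.substring(4).trim();
           body.setLength(0);
         } else if (!line.startsWith("====")) {
           // Accumulate body lines, skipping the "=====" case separator.
           body.append(line).append('\n');
         }
       }
       if (currentName != null) {
         sections.put(currentName, body.toString().trim());
       }
       return sections;
     }

     public static void main(String[] args)
     {
       String example =
           "=== case\n" +
           "SELECT constant array expression from table\n" +
           "=== SQL\n" +
           "SELECT ARRAY[1,2] as arr, dim1 FROM foo LIMIT 1\n";
       // Prints the SQL section body.
       System.out.println(parseSections(example).get("SQL"));
     }
   }
   ```

   A real parser would also need to handle comment lines before the first 
section and the multi-case structure of a single file; the sketch shows only 
the section-splitting idea.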
   
   ### Scope of This PR
   
   This PR includes only the test case definitions: both for the expected 
values (parsed from a ".case" file) and for an actual query run. This PR does 
not yet include the mechanism to run the cases; that mechanism depends on 
various other in-flight PRs and will arrive in a future PR. This PR allows us 
to establish the case file format itself, which can be used in both Calcite 
unit tests and new ITs.
   
   <hr>
   
   This PR has:
   
   - [X] been self-reviewed.
   - [X] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [X] added comments explaining the "why" and the intent of the code 
wherever it would not be obvious to an unfamiliar reader.
   - [X] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

