These are known issues. Calcite does not handle join reordering well, and 
several factors contribute to the inefficiency. Try turning off the rule 
that performs join associativity.

- Haisheng

------------------------------------------------------------------
From: Boyan Kolev<[email protected]>
Date: 2020-04-01 23:31:06
To: <[email protected]>
Subject: Query planning takes forever on TPC-H queries #5 and #7

Hello,

I am running a TPC-H benchmark on a database system that uses Calcite as its 
query engine.
However, Calcite never finishes planning two of the TPC-H queries (please see 
the attached Query05.sql and Query07.sql).
After waiting one hour for the result of the EXPLAIN, I simply interrupted it.

To isolate the issues from our specific environment, I was able to reproduce 
them with Calcite alone and a set of 8 CSV files (only headers + one line of 
data) that defines the TPC-H schema.

It is easy to reproduce on a clean Calcite 1.22.0 deployment, by copying the 
attached tpchmodel.json and unzipping the attached tpch.zip into 
calcite/example/csv/src/test/resources.
Then, as in the tutorial, run the queries with ./sqlline launched from the 
calcite/example/csv folder and connected with:
!connect jdbc:calcite:model=src/test/resources/tpchmodel.json admin admin

My observations so far on the complexity of the queries are that:

- Query #5 has a cycle in the join conditions. Removing the condition "and 
s_nationkey = c_nationkey" breaks the cycle and allows the query to be planned. 
The cycle, however, is not a problem by itself: keeping the cycle but removing 
the table "region" and its two associated WHERE clause conditions also allows 
planning to complete.

- Query #7 contains a disjunctive predicate. If either side of the OR 
expression is removed, the query is planned without any problem.
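
To illustrate the cycle in Query #5, here is a minimal sketch that builds the 
join graph and checks it for a cycle. It assumes the join conditions of the 
standard TPC-H Query 5 text (the attached Query05.sql may differ), and the 
names Q5_JOINS and has_cycle are hypothetical helpers, not part of Calcite.

```python
# Join graph of TPC-H Query 5: one undirected edge per equi-join condition.
# Assumption: conditions are taken from the standard TPC-H Q5 text.
Q5_JOINS = [
    ("customer", "orders"),    # c_custkey = o_custkey
    ("lineitem", "orders"),    # l_orderkey = o_orderkey
    ("lineitem", "supplier"),  # l_suppkey = s_suppkey
    ("customer", "supplier"),  # c_nationkey = s_nationkey
    ("supplier", "nation"),    # s_nationkey = n_nationkey
    ("nation", "region"),      # n_regionkey = r_regionkey
]


def has_cycle(edges):
    """Return True if the undirected graph given by `edges` has a cycle."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both tables are already connected, so this edge closes a cycle.
            return True
        parent[root_a] = root_b
    return False


# The two variants described in the observation above.
without_nationkey_join = [e for e in Q5_JOINS if e != ("customer", "supplier")]
without_region = [e for e in Q5_JOINS if "region" not in e]
```

With these definitions, has_cycle(Q5_JOINS) and has_cycle(without_region) both 
report a cycle, while has_cycle(without_nationkey_join) does not. This matches 
the observation that the cycle alone does not explain the planning time.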

Thanks in advance for any help and clarity on these issues!

Sincerely,
--
Boyan Kolev

