Hello. I am trying to implement a planner that generates optimal logical
query plans using statistics I provide to the schema. Currently, the only
available statistic is the number of rows in each table.
I am using HepPlanner. My actual problem is that when *findBestExp()* is
called, the resulting plan is not optimized. That is, the query is merely
parsed: the join order is the same as the one in the input query, and no
filter push-down is applied.
For example, for the query
"SELECT * FROM ftable f, products p WHERE f.id = p.pid AND p.pid = 2"
the resulting plan is:
12:LogicalProject(id=[$0], desc=[$1], price=[$2], loc=[$3], pid=[$4], pdesc=[$5]): rowcount = 225000.0, cumulative cost = 1.05002E7
  10:LogicalFilter(condition=[AND(=($0, $4), =($4, 2))]): rowcount = 225000.0, cumulative cost = 1.02752E7
    8:LogicalJoin(condition=[true], joinType=[inner]): rowcount = 1.0E7, cumulative cost = 1.00502E7
      0:EnumerableTableScan(table=[[fTable]]): rowcount = 50000.0, cumulative cost = 50000.0
      1:EnumerableTableScan(table=[[products]]): rowcount = 200.0, cumulative cost = 200.0
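
As a sanity check on the row counts in this plan, here is a small sketch in plain Java (no Calcite dependency). It assumes that the planner falls back to a default selectivity guess of 0.15 for each equality predicate; that constant, along with the class and method names, is my assumption for illustration and is not taken from my planner code. Under that assumption, the numbers are consistent with an unoptimized cross join followed by a filter with two equality conjuncts.

```java
public class PlanEstimates {
    // Assumption: default selectivity guess for one equality predicate.
    static final double EQUALS_SELECTIVITY = 0.15;

    // A join with condition=[true] is a cross product of its inputs.
    static double crossJoinRows(double leftRows, double rightRows) {
        return leftRows * rightRows;
    }

    // Each equality conjunct in the AND multiplies the estimate by the guess.
    static double filterRows(double inputRows, int equalityConjuncts) {
        double rows = inputRows;
        for (int i = 0; i < equalityConjuncts; i++) {
            rows *= EQUALS_SELECTIVITY;
        }
        return rows;
    }

    public static void main(String[] args) {
        double fTableRows = 50000.0;   // row count I supply for fTable
        double productsRows = 200.0;   // row count I supply for products

        double joinRows = crossJoinRows(fTableRows, productsRows);
        // Two conjuncts: f.id = p.pid and p.pid = 2
        double filteredRows = filterRows(joinRows, 2);

        System.out.printf("join rows   = %.1f%n", joinRows);
        System.out.printf("filter rows = %.1f%n", filteredRows);
    }
}
```

If the filter had been pushed below the join, the estimate at the join would be far smaller than the 1.0E7 shown for node 8, so the plan really does look untouched by any rule.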
I implemented this using Hive's TestCBORuleFiredOnlyOnce.java
<https://github.com/apache/hive/blob/48b201ee163252b2127ce04fbf660df70312888a/ql/src/test/org/apache/hadoop/hive/ql/optimizer/calcite/TestCBORuleFiredOnlyOnce.java>
and PlannerImpl.java
<https://github.com/apache/calcite/blob/5323d8d48baa2d7bc8dea8b03bc0bda93563e0f9/core/src/main/java/org/apache/calcite/prepare/PlannerImpl.java>
as examples, and there are some classes and overridden methods that I
currently use as “black boxes”. Here is a link to the code of my basic
class: http://pastebin.com/HysfNa8S.
--
Victor Giannakouris - Salalidis
LinkedIn:
http://gr.linkedin.com/pub/victor-giannakouris-salalidis/69/585/b23/
Personal Page: http://gsvic.github.io