Hi,
We were working on a problem to detect if materialized view can be used to
rewrite a query in non-trivial cases. Will briefly describe the problem and
approach below and would appreciate feedback on the same.
Motivation
---------------
For instance there exists a table "logs" and a partition (materialized
view) named "log_after_01_Jan" created on it and described by SQL :
"Select * from logs where log_date > '01-01-2015' ".
Assume that the table "log_after_01_Jan" is much smaller than table "logs".
For user query:
"Select log_body from logs where log_date > '03-03-2015' and
char_length(log_body) < 100",
we should detect that the materialized view "log_after_01_Jan" can be used
and transform the query into:
"Select log_body from log_after_01_Jan where log_date > '03-03-2015' and
char_length(log_body) < 100"
Approach
--------------
One of the fundamental problems we would come across here is to check if a
boolean condition X implies (=>) Y. This quickly reduces to the
Satisfiability problem which is NP complete for propositional logic. But
there are many instances like above which can be detected easily. We have
implemented an approach to handle several useful cases for few operators
and types of operands. Will be extending it further for more types of
operations.
Top Level approach:
1. Currently, VolcanoPlanner:useMaterialization tries to rewrite original
query using MV using SubstitutionVisitor. Have extended SubstitutionVisitor
to detect above cases and do the substitution.
2. To check if a condition X => Y,
a. Convert both of them into Disjunctive Normal Form.
Say X is transformed into x_1 or x_2 or x_3 ... or x_m and
Y is transformed into y_1 or y_2 ,... or y_i, where any x_i and y_i
are conjunctions of atomic predicates.
For instance condition "(a>10 or b>20) and c <90" will be converted
to DNF: (a>10 and c<90) or (b>20 and c<90).
b. For X=>Y to be a tautology i.e., hold always true, every conjunction
x_i should imply atleast one of the conjunction y_j.
We wrote some set of simple heuristics to check if a conjunction of
atomic predicates implies another.
This also involves executing RexNode using RexImplExecutor.
We have checked in code for this in our fork of
calcite(qubole/incubator-calcite). This is ongoing work and we will be
making many more improvements to it. If this is useful or anybody is
interested in giving feedback then I can share the commit so that we can
discuss about it and take it forward.
Regards,
Amogh
Member of Technical Staff
Qubole