Amogh,

   It would be nice if you can share the commit id.


1. #2.b "This also involves executing RexNode using RexImplExecutor²
Wondering why expr needs to be evaluated? Is it cover functions involving
literals.

2. how is expr ordering issue solved?
Example: dept != 10 vs 10 != dept

3. Column aliasing:
Select sales as a from t1 order by a
Vs
Select sales as b from t1  order by b

Thanks
John

On 6/25/15, 12:35 PM, "Julian Hyde" <[email protected]> wrote:

>This is great work. Certainly consistent with where I am heading.
>
>I would not be inclined to use DNF (because of its tendency to inflate
>certain predicates) but if you are able to get something effective I
>will happily use it. I think you should package it behind a method --
>"find out what is left to satisfy p when you have already satisfied q"
>or something -- and write lots of tests of that method, and it doesn't
>really matter what algorithm is behind it.
>
>Take a look at SubstitutionVisitor.simplfy(RexNode) and how it focuses
>on finding whether
>
>  p1 AND p2 AND p3 AND NOT (q1 AND q2 AND q3)
>
>is satisfiable.
>
>Later we will want to know not just "can I satisfy query Q using
>materialization M?" but "can I satisfy part of Q using M, and what is
>left over?". I can convert most of Q to use an aggregate table over
>years 2012 .. 2014 and 2015 Jan - May, and then scan the raw data for
>June 1st onwards, that is a big win.
>
>What branch are you working on? Your master branch
>https://github.com/qubole/incubator-calcite/tree/master seems to be
>the same as apache/master right now.
>
>If you can divide this work into pull requests with unit tests, I will
>happy commit each change as you make progress.
>
>By the way, I logged a few jira cases connected to materialized view
>rewrite today. They were motivated by the phoenix team wanting to use
>secondary indexes. But they could by applied to any scan-project-sort
>materialization. See
>
>* https://issues.apache.org/jira/browse/CALCITE-771
>* https://issues.apache.org/jira/browse/CALCITE-772
>* https://issues.apache.org/jira/browse/CALCITE-773
>
>Julian
>
>On Thu, Jun 25, 2015 at 11:46 AM, Amogh Margoor <[email protected]> wrote:
>> Hi,
>> We were working on a problem to detect if materialized view can be used
>>to
>> rewrite a query in non-trivial cases. Will briefly describe the problem
>>and
>> approach below and would appreciate feedback on the same.
>>
>> Motivation
>> ---------------
>> For instance there exists a table "logs" and a partition (materialized
>> view)  named "log_after_01_Jan" created on it and described by SQL :
>> "Select * from logs where log_date > '01-01-2015' ".
>>
>> Assume that the table "log_after_01_Jan" is much smaller than  table
>>"logs".
>>
>>  For user query:
>> "Select log_body from logs where log_date > '03-03-2015' and
>> char_length(log_body) < 100",
>> we should detect that the materialized view "log_after_01_Jan" can be
>>used
>> and transform the query into:
>>
>> "Select log_body from log_after_01_Jan where log_date > '03-03-2015' and
>> char_length(log_body) < 100"
>>
>> Approach
>> --------------
>> One of the fundamental problems we would come across here is to check
>>if a
>> boolean condition X implies (=>) Y. This quickly reduces to the
>> Satisfiability problem which is NP complete for propositional logic. But
>> there are many instances like above which can be detected easily. We
>>have
>> implemented an approach to handle several useful cases for few operators
>> and types of operands. Will be extending it further for more types of
>> operations.
>>
>> Top Level approach:
>>
>> 1. Currently, VolcanoPlanner:useMaterialization tries to rewrite
>>original
>> query using MV using SubstitutionVisitor. Have extended
>>SubstitutionVisitor
>> to detect above cases and do the substitution.
>>
>> 2. To check if a condition X => Y,
>>    a. Convert both of them into Disjunctive Normal Form.
>>        Say X is transformed into  x_1 or x_2 or x_3 ... or x_m and
>>        Y is transformed into y_1 or y_2 ,... or  y_i, where any x_i and
>>y_i
>> are conjunctions of atomic predicates.
>>        For instance condition "(a>10 or b>20) and c <90" will be
>>converted
>> to DNF: (a>10 and c<90)  or (b>20 and c<90).
>>
>>    b. For X=>Y to be a tautology i.e., hold always true, every
>>conjunction
>> x_i should imply atleast one of the conjunction y_j.
>>        We wrote some set of simple heuristics to check if a conjunction
>>of
>> atomic predicates implies another.
>>       This also involves executing RexNode using RexImplExecutor.
>>
>> We have checked in code for this in our fork of
>> calcite(qubole/incubator-calcite). This is ongoing work and we will be
>> making many more improvements to it. If this is useful or anybody is
>> interested in giving feedback then I can share the commit so that we can
>> discuss about it and take it forward.
>>
>> Regards,
>> Amogh
>> Member of Technical Staff
>> Qubole
>

Reply via email to