Re: When will be the stats based join selector be implemented?

2015-10-08 Thread Li Gao
Hi Maryann,

Those are great pointers. Thanks for the detailed descriptions.

Thanks,
Li


On Thu, Oct 8, 2015 at 1:08 PM, Maryann Xue  wrote:

> Hi Li,
>
> What you are asking about seems to be more a matter of Calcite knowledge.
>
> Anyway, in short, Calcite works with rules. You can think of it this way:
> applying a set of rules gives you a number of different query plans you
> could go with. Calcite then calculates the cumulative cost of each candidate
> (this is only the idea; the implementation differs a little) and picks
> the cheapest plan among these candidates.
>
> So, for example, we have several different implementations of joins in
> Phoenix, and those correspond to different physical operators in Calcite
> (PhoenixServerJoin.java, PhoenixClientJoin.java). We override the cost
> function ("computeSelfCost"), trying to model the runtime overhead as
> closely as we can. Both versions (using PhoenixServerJoin and
> PhoenixClientJoin) exist among the candidates, and which one comes out
> cheaper usually depends on the join's input. For instance, if both sides
> of the join operator are sorted on the join keys, the merge join is most
> likely going to be chosen.
>
> There are quite a lot of general optimization rules provided by Calcite
> already (in the Calcite project), like the filter push-down rule. There are
> also some Phoenix-specific rules under org.apache.phoenix.calcite.rel.rules.
>
> For examples, you can look at CalciteIT.java, which contains some basic
> test cases as well as some interesting stuff.
>
>
> Thanks,
> Maryann
>
>
>
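
A minimal sketch of the cost-based choice described above, assuming toy cost
formulas: every candidate plan gets a cost, and the cheapest one wins. The
names JoinPlanChooser, Candidate, sortCost, hashJoinCost, mergeJoinCost and
cheapest are hypothetical and exist only for illustration. None of this is
the Calcite or Phoenix API, and the real computeSelfCost overrides may model
cost quite differently.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: hypothetical names and made-up cost formulas,
// not the Calcite or Phoenix API.
class JoinPlanChooser {

    /** One candidate physical plan together with its estimated cost. */
    static final class Candidate {
        final String name;
        final double cost;

        Candidate(String name, double cost) {
            this.name = name;
            this.cost = cost;
        }
    }

    /** Assumed cost of sorting: rows * log2(rows), or 0 for at most one row. */
    static double sortCost(double rows) {
        return rows <= 1 ? 0.0 : rows * (Math.log(rows) / Math.log(2));
    }

    /** Assumed hash-join cost: stream the left side once; read the right side
     *  and build a hash table from it (counted as two passes over the right). */
    static double hashJoinCost(double leftRows, double rightRows) {
        return leftRows + 2.0 * rightRows;
    }

    /** Assumed merge-join cost: one merge pass over both sides, plus a sort
     *  for every side that is not already sorted on the join keys. */
    static double mergeJoinCost(double leftRows, double rightRows,
                                boolean leftSorted, boolean rightSorted) {
        double cost = leftRows + rightRows;
        if (!leftSorted) {
            cost += sortCost(leftRows);
        }
        if (!rightSorted) {
            cost += sortCost(rightRows);
        }
        return cost;
    }

    /** Enumerates both candidates and returns the one with the lowest cost. */
    static Candidate cheapest(double leftRows, double rightRows,
                              boolean leftSorted, boolean rightSorted) {
        List<Candidate> candidates = new ArrayList<>();
        candidates.add(new Candidate("hash-join", hashJoinCost(leftRows, rightRows)));
        candidates.add(new Candidate("merge-join",
                mergeJoinCost(leftRows, rightRows, leftSorted, rightSorted)));

        Candidate best = candidates.get(0);
        for (Candidate c : candidates) {
            if (c.cost < best.cost) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Candidate sorted = cheapest(1000, 1000, true, true);
        Candidate unsorted = cheapest(1000, 1000, false, false);
        System.out.println("both inputs sorted:   " + sorted.name);
        System.out.println("both inputs unsorted: " + unsorted.name);
    }
}
```

With these assumed formulas, two inputs that are already sorted on the join
keys favor the merge join because no sort has to be paid for, while unsorted
inputs of the same size favor the hash join.
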
> On Thu, Oct 8, 2015 at 2:37 PM, Li Gao  wrote:
>
>> Hi Maryann,
>>
>> I am wondering if you could help me understand how the Phoenix calcite
>> branch is using Calcite to do query optimizations
>>
>> i.e.
>>
>>    - some pointers to the code where the joins can detect whether a hash
>>    join or a sort merge join should be used for a given case
>>    - pointers to how the cost is calculated in the code
>>    - pointers to how the filter predicate push down is implemented in
>>    the code
>>
>> Examples  would be greatly appreciated.
>>
>> Thanks,
>> Li
>>
>>
>> On Mon, Oct 5, 2015 at 5:49 PM, Maryann Xue 
>> wrote:
>>
>>> Hi Li,
>>>
>>> Sorry, I forgot to mention that this calcite branch is now depending on
>>> Apache Calcite's master branch instead of any of its releases. So you need
>>> to checkout Calcite (git://github.com/apache/incubator-calcite.git)
>>> first and run `mvn install` for that project before going back to the
>>> Phoenix project and run mvn commands.
>>>
>>> On Mon, Oct 5, 2015 at 6:43 PM, Li Gao  wrote:
>>>
>>>> Hi Maryann,
>>>>
>>>> This looks great. Thanks for pointing me to the right branch!  For some
>>>> reason I am getting the following errors when I do mvn package
>>>>
>>>> [WARNING] The POM for
>>>> org.apache.calcite:calcite-avatica:jar:1.5.0-incubating-SNAPSHOT is
>>>> missing, no dependency information available
>>>>
>>>> [WARNING] The POM for
>>>> org.apache.calcite:calcite-core:jar:1.5.0-incubating-SNAPSHOT is missing,
>>>> no dependency information available
>>>>
>>>> [WARNING] The POM for
>>>> org.apache.calcite:calcite-core:jar:tests:1.5.0-incubating-SNAPSHOT is
>>>> missing, no dependency information available
>>>>
>>>> [WARNING] The POM for
>>>> org.apache.calcite:calcite-linq4j:jar:1.5.0-incubating-SNAPSHOT is missing,
>>>> no dependency information available
>>>>
>>>> Where can I find these dependencies?
>>>>
>>>> Thanks,
>>>>
>>>> Li
>>>>
>>>>
>>>>
>>>> On Mon, Oct 5, 2015 at 12:19 PM, Maryann Xue
>>>> wrote:
>>>>
>>>>> Hi Li,
>>>>>
>>>>> We are moving towards integrating with Calcite as our stats based
>>>>> optimization now. You can checkout our calcite
>>>>>
>>>>> branch and play with it if you are interested. It's still under
>>>>> development, but you can already see some amazing optimization examples in
>>>>> our test file CalciteIT.java. You can also go
>>>>> http://www.slideshare.net/HBaseCon/ecosystem-session-2-49044349 for
>>>>> more information.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Maryann
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Oct 5, 2015 at 2:08 PM, Li Gao  wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am currently looking into getting optimized joins based on table
>>>>>> stats. I noticed in the QueryCompile at line 232-234 is still saying
>>>>>> "TODO".
>>>>>>
>>>>>>
>>>>>> https://github.com/apache/phoenix/blob/4.x-HBase-1.0/phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
>>>>>>
>>>>>> We have a need to get the selector enabled based on the size of the
>>>>>> the LHS and RHS table.
>>>>>>
>>>>>> Thanks,
>>>>>> Li
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: When will be the stats based join selector be implemented?

2015-10-08 Thread Li Gao
Hi Maryann,

I am wondering if you could help me understand how the Phoenix calcite
branch uses Calcite to do query optimization. Specifically, I am looking for:

   - some pointers to the code where the joins can detect whether a hash
   join or a sort merge join should be used for a given case
   - pointers to how the cost is calculated in the code
   - pointers to how the filter predicate push down is implemented in the
   code

Examples would be greatly appreciated.

Thanks,
Li


On Mon, Oct 5, 2015 at 5:49 PM, Maryann Xue  wrote:

> Hi Li,
>
> Sorry, I forgot to mention that this calcite branch is now depending on
> Apache Calcite's master branch instead of any of its releases. So you need
> to checkout Calcite (git://github.com/apache/incubator-calcite.git) first
> and run `mvn install` for that project before going back to the Phoenix
> project and run mvn commands.
>
> On Mon, Oct 5, 2015 at 6:43 PM, Li Gao  wrote:
>
>> Hi Maryann,
>>
>> This looks great. Thanks for pointing me to the right branch!  For some
>> reason I am getting the following errors when I do mvn package
>>
>> [WARNING] The POM for
>> org.apache.calcite:calcite-avatica:jar:1.5.0-incubating-SNAPSHOT is
>> missing, no dependency information available
>>
>> [WARNING] The POM for
>> org.apache.calcite:calcite-core:jar:1.5.0-incubating-SNAPSHOT is missing,
>> no dependency information available
>>
>> [WARNING] The POM for
>> org.apache.calcite:calcite-core:jar:tests:1.5.0-incubating-SNAPSHOT is
>> missing, no dependency information available
>>
>> [WARNING] The POM for
>> org.apache.calcite:calcite-linq4j:jar:1.5.0-incubating-SNAPSHOT is missing,
>> no dependency information available
>>
>> Where can I find these dependencies?
>>
>> Thanks,
>>
>> Li
>>
>>
>>
>> On Mon, Oct 5, 2015 at 12:19 PM, Maryann Xue 
>> wrote:
>>
>>> Hi Li,
>>>
>>> We are moving towards integrating with Calcite as our stats based
>>> optimization now. You can checkout our calcite
>>> 
>>> branch and play with it if you are interested. It's still under
>>> development, but you can already see some amazing optimization examples in
>>> our test file CalciteIT.java. You can also go
>>> http://www.slideshare.net/HBaseCon/ecosystem-session-2-49044349 for
>>> more information.
>>>
>>>
>>> Thanks,
>>> Maryann
>>>
>>>
>>>
>>>
>>> On Mon, Oct 5, 2015 at 2:08 PM, Li Gao  wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am currently looking into getting optimized joins based on table
>>>> stats. I noticed in the QueryCompile at line 232-234 is still saying
>>>> "TODO".
>>>>
>>>>
>>>> https://github.com/apache/phoenix/blob/4.x-HBase-1.0/phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
>>>>
>>>> We have a need to get the selector enabled based on the size of the the
>>>> LHS and RHS table.
>>>>
>>>> Thanks,
>>>> Li
>>>>
>>>
>>>
>>
>


Re: When will be the stats based join selector be implemented?

2015-10-05 Thread Maryann Xue
Hi Li,

We are now moving towards integrating with Calcite for our stats-based
optimization. You can check out our calcite branch and play with it if you
are interested. It's still under development, but you can already see some
amazing optimization examples in our test file CalciteIT.java. You can also
go to http://www.slideshare.net/HBaseCon/ecosystem-session-2-49044349 for
more information.


Thanks,
Maryann




On Mon, Oct 5, 2015 at 2:08 PM, Li Gao  wrote:

> Hi all,
>
> I am currently looking into getting optimized joins based on table stats.
> I noticed that QueryCompiler, at lines 232-234, still says "TODO".
>
>
> https://github.com/apache/phoenix/blob/4.x-HBase-1.0/phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
>
> We need the selector to be enabled based on the sizes of the LHS and RHS
> tables.
>
> Thanks,
> Li
>


Re: When will be the stats based join selector be implemented?

2015-10-05 Thread Maryann Xue
Hi Li,

Sorry, I forgot to mention that this calcite branch now depends on Apache
Calcite's master branch rather than on any of its releases. So you need to
check out Calcite (git://github.com/apache/incubator-calcite.git) first
and run `mvn install` for that project before going back to the Phoenix
project and running mvn commands.

On Mon, Oct 5, 2015 at 6:43 PM, Li Gao  wrote:

> Hi Maryann,
>
> This looks great. Thanks for pointing me to the right branch!  For some
> reason I am getting the following errors when I do mvn package
>
> [WARNING] The POM for
> org.apache.calcite:calcite-avatica:jar:1.5.0-incubating-SNAPSHOT is
> missing, no dependency information available
>
> [WARNING] The POM for
> org.apache.calcite:calcite-core:jar:1.5.0-incubating-SNAPSHOT is missing,
> no dependency information available
>
> [WARNING] The POM for
> org.apache.calcite:calcite-core:jar:tests:1.5.0-incubating-SNAPSHOT is
> missing, no dependency information available
>
> [WARNING] The POM for
> org.apache.calcite:calcite-linq4j:jar:1.5.0-incubating-SNAPSHOT is missing,
> no dependency information available
>
> Where can I find these dependencies?
>
> Thanks,
>
> Li
>
>
>
> On Mon, Oct 5, 2015 at 12:19 PM, Maryann Xue 
> wrote:
>
>> Hi Li,
>>
>> We are moving towards integrating with Calcite as our stats based
>> optimization now. You can checkout our calcite
>> 
>> branch and play with it if you are interested. It's still under
>> development, but you can already see some amazing optimization examples in
>> our test file CalciteIT.java. You can also go
>> http://www.slideshare.net/HBaseCon/ecosystem-session-2-49044349 for more
>> information.
>>
>>
>> Thanks,
>> Maryann
>>
>>
>>
>>
>> On Mon, Oct 5, 2015 at 2:08 PM, Li Gao  wrote:
>>
>>> Hi all,
>>>
>>> I am currently looking into getting optimized joins based on table
>>> stats. I noticed in the QueryCompile at line 232-234 is still saying "TODO".
>>>
>>>
>>> https://github.com/apache/phoenix/blob/4.x-HBase-1.0/phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
>>>
>>> We have a need to get the selector enabled based on the size of the the
>>> LHS and RHS table.
>>>
>>> Thanks,
>>> Li
>>>
>>
>>
>


Re: When will be the stats based join selector be implemented?

2015-10-05 Thread Li Gao
Hi Maryann,

This looks great. Thanks for pointing me to the right branch! For some
reason I am getting the following errors when I run `mvn package`:

[WARNING] The POM for
org.apache.calcite:calcite-avatica:jar:1.5.0-incubating-SNAPSHOT is
missing, no dependency information available

[WARNING] The POM for
org.apache.calcite:calcite-core:jar:1.5.0-incubating-SNAPSHOT is missing,
no dependency information available

[WARNING] The POM for
org.apache.calcite:calcite-core:jar:tests:1.5.0-incubating-SNAPSHOT is
missing, no dependency information available

[WARNING] The POM for
org.apache.calcite:calcite-linq4j:jar:1.5.0-incubating-SNAPSHOT is missing,
no dependency information available

Where can I find these dependencies?

Thanks,

Li



On Mon, Oct 5, 2015 at 12:19 PM, Maryann Xue  wrote:

> Hi Li,
>
> We are moving towards integrating with Calcite as our stats based
> optimization now. You can checkout our calcite
> 
> branch and play with it if you are interested. It's still under
> development, but you can already see some amazing optimization examples in
> our test file CalciteIT.java. You can also go
> http://www.slideshare.net/HBaseCon/ecosystem-session-2-49044349 for more
> information.
>
>
> Thanks,
> Maryann
>
>
>
>
> On Mon, Oct 5, 2015 at 2:08 PM, Li Gao  wrote:
>
>> Hi all,
>>
>> I am currently looking into getting optimized joins based on table stats.
>> I noticed in the QueryCompile at line 232-234 is still saying "TODO".
>>
>>
>> https://github.com/apache/phoenix/blob/4.x-HBase-1.0/phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
>>
>> We have a need to get the selector enabled based on the size of the the
>> LHS and RHS table.
>>
>> Thanks,
>> Li
>>
>
>