Hi Maryann,

Those are great pointers. Thanks for the detailed descriptions.

Thanks,
Li

On Thu, Oct 8, 2015 at 1:08 PM, Maryann Xue <[email protected]> wrote:

> Hi Li,
>
> What you are concerned with here seems to be more about knowledge of Calcite.
>
> Anyway, in short, Calcite works with rules. You can think of it this way:
> applying a set of rules gives you a bunch of different query plans you could
> go with. Calcite then calculates the cumulative cost for each candidate
> (this is only the idea; the implementation differs a little bit) and picks
> the cheapest plan out of these candidates.
>
> So for example, we have several different implementations for joins in
> Phoenix, and those correspond to different physical operators in Calcite
> (PhoenixServerJoin.java, PhoenixClientJoin.java). We override the cost
> function ("computeSelfCost"), trying to model the runtime overhead as
> closely as possible. But both versions (using PhoenixServerJoin and
> PhoenixClientJoin) exist in the candidates, and which one comes out cheaper
> usually depends on the join's input. For example, if both sides of the join
> operator are sorted on the join keys, the merge join is most likely going to
> be chosen.
>
> There are quite a lot of general optimization rules provided by Calcite
> already (in the Calcite project), like the filter push down rule. There are
> also some Phoenix specific rules under org.apache.phoenix.calcite.rel.rules.
>
> For examples, you can look at CalciteIT.java, which contains some basic
> test cases as well as some interesting stuff.
>
>
> Thanks,
> Maryann
>
> On Thu, Oct 8, 2015 at 2:37 PM, Li Gao <[email protected]> wrote:
>
>> Hi Maryann,
>>
>> I am wondering if you could help me understand how the Phoenix calcite
>> branch is using Calcite to do query optimizations
>>
>> i.e.
>>
>> - some pointers to the code where the joins can detect whether a hash
>> join or a sort merge join should be used for a given case
>> - pointers to how the cost is calculated in the code
>> - pointers to how the filter predicate push down is implemented in
>> the code
>>
>> Examples would be greatly appreciated.
>>
>> Thanks,
>> Li
>>
>>
>> On Mon, Oct 5, 2015 at 5:49 PM, Maryann Xue <[email protected]>
>> wrote:
>>
>>> Hi Li,
>>>
>>> Sorry, I forgot to mention that this calcite branch now depends on
>>> Apache Calcite's master branch instead of any of its releases. So you need
>>> to check out Calcite (git://github.com/apache/incubator-calcite.git)
>>> first and run `mvn install` for that project before going back to the
>>> Phoenix project and running mvn commands.
>>>
>>> On Mon, Oct 5, 2015 at 6:43 PM, Li Gao <[email protected]> wrote:
>>>
>>>> Hi Maryann,
>>>>
>>>> This looks great. Thanks for pointing me to the right branch! For some
>>>> reason I am getting the following errors when I do mvn package
>>>>
>>>> [WARNING] The POM for
>>>> org.apache.calcite:calcite-avatica:jar:1.5.0-incubating-SNAPSHOT is
>>>> missing, no dependency information available
>>>>
>>>> [WARNING] The POM for
>>>> org.apache.calcite:calcite-core:jar:1.5.0-incubating-SNAPSHOT is missing,
>>>> no dependency information available
>>>>
>>>> [WARNING] The POM for
>>>> org.apache.calcite:calcite-core:jar:tests:1.5.0-incubating-SNAPSHOT is
>>>> missing, no dependency information available
>>>>
>>>> [WARNING] The POM for
>>>> org.apache.calcite:calcite-linq4j:jar:1.5.0-incubating-SNAPSHOT is missing,
>>>> no dependency information available
>>>>
>>>> Where can I find these dependencies?
>>>>
>>>> Thanks,
>>>>
>>>> Li
>>>>
>>>> On Mon, Oct 5, 2015 at 12:19 PM, Maryann Xue <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Li,
>>>>>
>>>>> We are moving towards integrating with Calcite as our stats based
>>>>> optimization now.
>>>>> You can check out our calcite
>>>>> <https://git1-us-west.apache.org/repos/asf?p=phoenix.git;a=shortlog;h=refs/heads/calcite>
>>>>> branch and play with it if you are interested. It's still under
>>>>> development, but you can already see some amazing optimization examples in
>>>>> our test file CalciteIT.java. You can also go to
>>>>> http://www.slideshare.net/HBaseCon/ecosystem-session-2-49044349 for
>>>>> more information.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Maryann
>>>>>
>>>>> On Mon, Oct 5, 2015 at 2:08 PM, Li Gao <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am currently looking into getting optimized joins based on table
>>>>>> stats. I noticed that QueryCompiler, at lines 232-234, still says
>>>>>> "TODO".
>>>>>>
>>>>>> https://github.com/apache/phoenix/blob/4.x-HBase-1.0/phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
>>>>>>
>>>>>> We have a need to get the selector enabled based on the size of the
>>>>>> LHS and RHS tables.
>>>>>>
>>>>>> Thanks,
>>>>>> Li
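
For readers of this thread, the idea described above (several candidate join operators, each with its own cost, and the cheapest one wins) can be sketched in plain Java. This is only a toy model and not Calcite or Phoenix code: the class `PlanCostSketch`, the types `Input` and `Candidate`, and all cost formulas and constants are assumptions made up for illustration. The real branch expresses this through "computeSelfCost" overrides on operators such as PhoenixServerJoin and PhoenixClientJoin, and, as noted in the thread, the actual planner implementation differs from this simple picture.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy model of cost-based join selection; hypothetical, not Calcite/Phoenix code. */
public class PlanCostSketch {

    /** A join input: estimated row count and whether it is sorted on the join key. */
    public static final class Input {
        public final double rows;
        public final boolean sortedOnJoinKey;

        public Input(double rows, boolean sortedOnJoinKey) {
            this.rows = rows;
            this.sortedOnJoinKey = sortedOnJoinKey;
        }
    }

    /** A candidate physical plan with its cumulative cost. */
    public static final class Candidate {
        public final String name;
        public final double cost;

        public Candidate(String name, double cost) {
            this.name = name;
            this.cost = cost;
        }
    }

    /** Assumed cost: build a hash table on the right side (2 units/row), probe with the left. */
    public static double hashJoinCost(Input left, Input right) {
        return left.rows + 2.0 * right.rows;
    }

    /** Assumed cost of sorting an input: zero if already sorted, else n * log2(n). */
    public static double sortCost(Input in) {
        if (in.sortedOnJoinKey || in.rows <= 1) {
            return 0.0;
        }
        return in.rows * (Math.log(in.rows) / Math.log(2));
    }

    /** Assumed cost: sort whichever side needs it, then one linear merge pass. */
    public static double mergeJoinCost(Input left, Input right) {
        return sortCost(left) + sortCost(right) + left.rows + right.rows;
    }

    /** Enumerate both candidates and pick the one with the lowest cost. */
    public static Candidate cheapest(Input left, Input right) {
        List<Candidate> candidates = Arrays.asList(
                new Candidate("HashJoin", hashJoinCost(left, right)),
                new Candidate("SortMergeJoin", mergeJoinCost(left, right)));
        return candidates.stream()
                .min(Comparator.comparingDouble((Candidate c) -> c.cost))
                .get(); // safe: the list is never empty
    }

    public static void main(String[] args) {
        Input sorted = new Input(1000, true);
        Input unsorted = new Input(1000, false);
        System.out.println("sorted inputs:   " + cheapest(sorted, sorted).name);
        System.out.println("unsorted inputs: " + cheapest(unsorted, unsorted).name);
    }
}
```

With these made-up numbers, two inputs already sorted on the join key select the merge join (no sort cost is added), while two unsorted inputs select the hash join, which matches the behavior described in the thread.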
