Re: Adding support for Ignite secondary indexes to Apache Calcite planner

Zhenya Stanilovsky Tue, 10 Dec 2019 05:41:12 -0800
Roman, just a quick remark: Phoenix builds its approach on top of the already 
existing monolithic HBase architecture. In most cases it is just a stub for 
someone who wants to use secondary indexes with a database that has no native 
support for them. I don't think it's a good idea here.
   
>
>
>------- Forwarded message -------
>From: "Roman Kondakov" < [email protected] >
>To:  [email protected]
>Cc:
>Subject: Adding support for Ignite secondary indexes to Apache Calcite
>planner
>Date: Tue, 10 Dec 2019 15:55:52 +0300
>
>Hi all!
>
>As you may know, work is under way to integrate the Apache Calcite
>query optimizer into the Ignite codebase [1],[2].
>
>One of the problems in this integration is the absence of
>out-of-the-box support for secondary indexes in Apache Calcite. After
>some research I came to the conclusion that this problem has a couple of
>workarounds. Let's name them:
>1. Phoenix-style approach - representing secondary indexes as
>materialized views, which are natively supported by the Calcite engine [3]
>2. Drill-style approach - pushing filters into the table scans and
>choosing an appropriate index for lookups when possible [4]
>
>Both these approaches have advantages and disadvantages:
>
>Phoenix style pros:
>- natural way of adding indexes as an alternative source of rows: an index
>can be considered a kind of sorted materialized view.
>- possibility of using index sortedness for stream aggregates,
>deduplication (DISTINCT operator), merge joins, etc.
>- ability to support other types of indexes (e.g. functional indexes).
>
>Phoenix style cons:
>- polluting the optimizer's search space with extra table scans, hence
>increasing the planning time.
>
>Drill style pros:
>- easier to implement (although it's questionable).
>- search space is not inflated.
>
>Drill style cons:
>- missed opportunity to exploit sortedness.
>
>A good discussion of both approaches can be found in [5].
>
>I made a small sketch [6] in order to demonstrate the applicability of
>the Phoenix approach to Ignite. Key design concepts are:
>1. On creation, indexes are registered as tables in the Calcite schema.
>This step is needed for Calcite's internal routines.
>2. On planner initialization we register these indexes as materialized
>views in Calcite's optimizer using the VolcanoPlanner#addMaterialization
>method.
>3. Right before the query execution Calcite selects all materialized
>views (indexes) that can potentially be used in the query.
>4. During the query optimization indexes are registered by the planner as
>ordinary TableScans and hence can be chosen by the optimizer if they have
>a lower cost.
>
>This sketch shows only the ability to exploit index sortedness. So
>future work in this direction should focus on using indexes for
>fast lookups. At first glance FilterableTable and
>FilterTableScanRule are good starting points. We can push the Filter into
>the TableScan and then use FilterableTable for fast index lookups,
>instead of reading the whole index in the TableScan step and then
>filtering its output in the Filter step.
>
>What do you think?
>
>
>
>[1]  http://apache-ignite-developers.2346864.n4.nabble.com/New-SQL-execution-engine-tt43724.html#none
>[2]  https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine
>[3]  https://issues.apache.org/jira/browse/PHOENIX-2047
>[4]  https://issues.apache.org/jira/browse/DRILL-6381
>[5]  https://issues.apache.org/jira/browse/DRILL-3929
>[6]  https://github.com/apache/ignite/pull/7115
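
To illustrate the trade-off discussed in the quoted message, here is a minimal toy model of the Phoenix-style idea: an index is offered to the optimizer as one more scan that happens to be sorted, and the cheapest candidate wins. It does not use any Calcite or Ignite API. The class, the cost formula (scan cost n, explicit sort cost n * log2(n)) and the names PERSON and PERSON_NAME_IDX are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class IndexAsAlternativeScan {

    /** Hypothetical access path: the base table or one of its indexes. */
    static final class Scan {
        final String name;
        final double rowCount;
        final String sortedBy; // column the rows come back ordered by, or null

        Scan(String name, double rowCount, String sortedBy) {
            this.name = name;
            this.rowCount = rowCount;
            this.sortedBy = sortedBy;
        }
    }

    /** Toy cost of "SELECT * FROM t ORDER BY orderBy" read through the given scan. */
    static double cost(Scan scan, String orderBy) {
        double n = Math.max(scan.rowCount, 1.0);
        boolean alreadySorted = orderBy.equals(scan.sortedBy);
        // Assumed model: an explicit sort costs n * log2(n); a sorted scan avoids it.
        double sortCost = alreadySorted ? 0.0 : n * Math.log(n) / Math.log(2);
        return n + sortCost;
    }

    /** Picks the candidate with the lowest toy cost. */
    static Scan cheapest(List<Scan> candidates, String orderBy) {
        return candidates.stream()
            .min(Comparator.comparingDouble((Scan s) -> cost(s, orderBy)))
            .orElseThrow(() -> new IllegalArgumentException("no candidate scans"));
    }

    public static void main(String[] args) {
        // Every extra index adds one more candidate to compare: this is the
        // "search space pollution" listed among the Phoenix-style cons.
        List<Scan> candidates = Arrays.asList(
            new Scan("PERSON", 1000, null),
            new Scan("PERSON_NAME_IDX", 1000, "name"));

        System.out.println(cheapest(candidates, "name").name); // prints PERSON_NAME_IDX
    }
}
```

In this model the index wins only because the query needs its sort order. For an ORDER BY on an unindexed column both candidates cost the same, which is where filter push-down for index lookups would have to make the difference.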