Calcite does support table hint now, it's syntax is Oracle style[1], i saw that many engines support a INDEX hint to force a index scan on table[2] [3], maybe you can have a try also.
[1] https://calcite.apache.org/docs/reference.html#sql-hints [2] https://docs.oracle.com/cd/B13789_01/server.101/b10752/hintsref.htm#5156 [3] https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-table?view=sql-server-ver15 Best, Danny Chan 在 2020年5月29日 +0800 PM4:44,Tim Fox <[email protected]>,写道: > Hi, > > I'm building a query engine with Calcite - really enjoying working with > Calcite so far! > > When creating a plan, it seems Calcite always creates a plan where the > sources are table scans, however in my implementation the tables can have > indexes on them so a table scan is not always the right choice. > > I was wondering if there was any way of making Calcite "index aware" - e.g. > perhaps providing hints to the table scan instance that, actually, an index > scan or a primary key lookup should be used instead of actually scanning > the table. E.g. On the table meta-data if we provided information about any > indexes on the table, then Calcite could figure out what parts of the query > to push to the table scan and which to keep in the rest of the plan. > > There are two specific cases I really care about: > > 1. Queries that contain a primary key lookup: > > select * from some_table where key_column=23 AND some_other_column='foo'; > > In the above case the 'select * from some_table where key_column=23' can be > implemented as a simple PK lookup in the source table, not requiring a > scan, thus leaving just the filter corresponding to > 'some_other_column='foo'' in the rest of the plan > > 2. Queries with expressions on a column which has a secondary index > > select * from users where country='UK' and some_other_column='foo'; > > We have many users, and let's say 10% of them are from UK (still a lot). We > have a secondary index in the country column in the source table so we can > do an efficient index scan to retrieve the matching records. > > I found this document > https://calcite.apache.org/docs/materialized_views.html which seems like it > might help me in some way. > > The idea being if I can think of my indexes as materialized views then the > query can be written against those materialized views as sources instead of > the original table sources. There appears to be a rule > 'MaterializedViewRule' that does this already (?). > > This seems to get me a bit further, however, for this approach to work, it > seems I would have to create materialized views _dynamically_ during > evaluation of the query, register them, rewrite the query, execute it, then > deregister the materialized view. > > E.g. for the primary key lookup example above, for the following query: > > select * from some_table where key_column=23 AND some_other_column='foo'; > > I would need to dynamically create a materialized view corresponding to: > > select * from some_table where key_column=23 > > Then rewrite the query using MaterializedViewRule. > > In the general case, in order to figure out what materialized views I need > to dynamically create I would need to examine the query, figure out which > columns in expressions have indexes on them and from them work out the best > materialized view to create based on that information. This seems non > trivial. > > Does anyone have any suggestions or pointers for how to implement this kind > of thing? I suspect I'm not the first person to have tried to do this, as > using indexes on tables seems a pretty common thing in many systems (?)
