[
https://issues.apache.org/jira/browse/CALCITE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030419#comment-18030419
]
Alessandro Solimando commented on CALCITE-7232:
-----------------------------------------------
Translating a SEARCH operator to a tree of ORs for what came in as an IN
operator doesn't feel great, I agree.
From my memory of Apache Hive internals, there are some rules (e.g.,
PointLookupQuery) matching on IN operators, and some post-planning code
relying on having IN operators in the output plan (e.g., partition pruning).
So I guess translating to IN at the "boundary" won't work for you, as it
wouldn't address the first use-case during planning?
Ideally we could meet halfway: support efficiently re-translating SEARCH to IN
(instead of ORs) after optimization (once the best plan is returned) as a
RelToRel operation, while custom planner rules adapt to SEARCH?
I wonder whether a utility class providing "answers" over a SEARCH operator,
for whatever is interesting to ask of an IN operator, could be used as a
drop-in for evolving custom rules that use IN?
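
To make the idea concrete, here is a rough sketch of such a utility as a
self-contained toy model. It deliberately does not use Calcite classes:
ToySarg, Range, and SearchAsIn are hypothetical names that stand in for a
SEARCH argument and the helper, and the values are plain integers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Toy stand-in for a SEARCH argument: a list of closed integer ranges. */
final class ToySarg {
  /** Closed range [lo, hi]; it is a single point when lo == hi. */
  static final class Range {
    final int lo;
    final int hi;

    Range(int lo, int hi) {
      this.lo = lo;
      this.hi = hi;
    }

    boolean isPoint() {
      return lo == hi;
    }
  }

  final List<Range> ranges;

  ToySarg(Range... ranges) {
    this.ranges = Arrays.asList(ranges);
  }
}

/** Hypothetical helper that answers IN-style questions about a ToySarg. */
final class SearchAsIn {
  private SearchAsIn() {}

  /** True when the argument is non-empty and every range is a single point. */
  static boolean isInList(ToySarg sarg) {
    if (sarg.ranges.isEmpty()) {
      return false;
    }
    for (ToySarg.Range r : sarg.ranges) {
      if (!r.isPoint()) {
        return false;
      }
    }
    return true;
  }

  /** The IN-list values, or empty when the argument is not a pure point list. */
  static Optional<List<Integer>> inValues(ToySarg sarg) {
    if (!isInList(sarg)) {
      return Optional.empty();
    }
    List<Integer> values = new ArrayList<>();
    for (ToySarg.Range r : sarg.ranges) {
      values.add(r.lo);
    }
    return Optional.of(values);
  }
}
```

A rule that used to match on IN could then ask SearchAsIn.isInList(...) and
read the values through SearchAsIn.inValues(...) without ever seeing an IN
call in the plan.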
On top of that we could also envision a RelToRel operation (a visitor?) that
can explicitly translate SEARCH to IN.
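
As an illustration of the visitor idea, below is a minimal sketch over a toy
expression tree. All types here (Expr, Column, Search, In, And,
SearchToInRewriter) are hypothetical and are not Calcite classes; the sketch
only assumes that a SEARCH made solely of point ranges can become an IN,
while anything else is left untouched.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy expression node; stands in for a row expression. */
interface Expr {}

final class Column implements Expr {
  final String name;
  Column(String name) { this.name = name; }
  @Override public String toString() { return name; }
}

/** SEARCH over closed integer ranges, each stored as {lo, hi}. */
final class Search implements Expr {
  final Expr operand;
  final List<int[]> ranges;
  Search(Expr operand, List<int[]> ranges) {
    this.operand = operand;
    this.ranges = ranges;
  }
  @Override public String toString() {
    StringBuilder sb = new StringBuilder("SEARCH(" + operand);
    for (int[] r : ranges) {
      sb.append(", [").append(r[0]).append("..").append(r[1]).append("]");
    }
    return sb.append(")").toString();
  }
}

final class In implements Expr {
  final Expr operand;
  final List<Integer> values;
  In(Expr operand, List<Integer> values) {
    this.operand = operand;
    this.values = values;
  }
  @Override public String toString() {
    // List.toString() yields "[1, 5]"; swap the brackets for parentheses.
    return operand + " IN " + values.toString().replace('[', '(').replace(']', ')');
  }
}

final class And implements Expr {
  final List<Expr> operands;
  And(List<Expr> operands) { this.operands = operands; }
  @Override public String toString() {
    StringBuilder sb = new StringBuilder();
    for (Expr e : operands) {
      if (sb.length() > 0) {
        sb.append(" AND ");
      }
      sb.append(e);
    }
    return sb.toString();
  }
}

/** Recursively rewrites point-only SEARCH nodes to IN; other nodes are kept. */
final class SearchToInRewriter {
  private SearchToInRewriter() {}

  static Expr rewrite(Expr e) {
    if (e instanceof And) {
      List<Expr> rewritten = new ArrayList<>();
      for (Expr child : ((And) e).operands) {
        rewritten.add(rewrite(child));
      }
      return new And(rewritten);
    }
    if (e instanceof Search) {
      Search s = (Search) e;
      List<Integer> values = new ArrayList<>();
      for (int[] r : s.ranges) {
        if (r[0] != r[1]) {
          return s; // a real range: not expressible as IN, keep SEARCH
        }
        values.add(r[0]);
      }
      return values.isEmpty() ? s : new In(s.operand, values);
    }
    return e;
  }
}
```

Running the rewriter once on the final plan would keep the translation logic
in a single place, and mixed or range-based SEARCH calls would still need
their own expansion.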
WDYT?
> Restore use of IN operator in RexCall
> -------------------------------------
>
> Key: CALCITE-7232
> URL: https://issues.apache.org/jira/browse/CALCITE-7232
> Project: Calcite
> Issue Type: Task
> Reporter: Stamatis Zampetakis
> Priority: Major
>
> The use of {{IN}} operator in {{RexCall}} was superseded by the introduction
> of the {{SEARCH}} operator (CALCITE-4173) and its use is strictly forbidden
> through
> [assertions|https://github.com/apache/calcite/blob/6cbbf560b721cb88354c33751aa72b16a58ded23/core/src/main/java/org/apache/calcite/rex/RexCall.java#L94].
> The {{SEARCH}} operator is more general and powerful than {{IN}} so it's a
> perfect abstraction to use during the optimization phase.
> However, most databases don't have a {{SEARCH}} operator so the latter needs
> to be transformed back to {{IN}} (or something else) at some point in time.
> For instance, Apache Hive has two ways of generating an executable plan:
> * take a {{RelNode}} and generate an AST tree
> * take a {{RelNode}} and generate a Hive Operator tree
> both of which are eventually going to be executed.
> *If we don't allow* IN in a RexCall, then it means that we need to create
> special code to handle SEARCH in both code paths that differ only slightly in
> each case. (In reality the situation is more complicated for Hive because
> there are at least two more places where we need to do a SEARCH to IN
> transformation).
> *If we allow IN* in a RexCall, then at the end of the RelNode optimization
> phase we can "expand" {{SEARCH}} to {{IN}} so the transformation logic only
> appears in one place and it remains a {{RelNode}} to {{RelNode}} conversion.
> In fact, the same transformation logic could be exploited in
> [SqlImplementor|https://github.com/apache/calcite/blob/6cbbf560b721cb88354c33751aa72b16a58ded23/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L815]
> that does another {{RelNode}} to "something" conversion.
> The obvious downside with this proposal is that if people start mixing the IN
> operator in various optimization rules/phases it can certainly affect the
> quality of the plans and the planning time.