[ 
https://issues.apache.org/jira/browse/CALCITE-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030419#comment-18030419
 ] 

Alessandro Solimando commented on CALCITE-7232:
-----------------------------------------------

Translating a SEARCH operator to a tree of ORs for what came in as an IN 
operator doesn't feel great, I agree.

From my memory of Apache Hive internals, there are some rules (e.g., 
PointLookupQuery) matching on IN operators, and some post-planning code 
relying on having IN operators in the output plan (e.g., partition pruning).

So I guess translating to IN at the "boundary" won't work for you, as it 
wouldn't address the first use-case during planning?

Ideally we could meet halfway: support re-translating SEARCH efficiently to IN 
(instead of ORs) after optimization (once the best plan is returned) as a 
RelToRel operation, while custom planner rules adapt to SEARCH?

I wonder if a utility class providing "answers" over a SEARCH operator for 
whatever is interesting to ask of an IN operator could be used as a drop-in for 
evolving custom rules that currently match on IN?
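
To make the idea a bit more concrete, here is a minimal sketch of the kind of 
questions such a utility could answer. It is a deliberately simplified toy 
model: ToyRange and InView are hypothetical names, ranges are closed integer 
intervals, and nothing here is Calcite's actual Sarg or RexNode API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy stand-in for a closed range [lower, upper]; NOT Calcite's Sarg or Range. */
final class ToyRange {
  final int lower;
  final int upper;

  ToyRange(int lower, int upper) {
    this.lower = lower;
    this.upper = upper;
  }

  /** A range that contains exactly one value. */
  boolean isPoint() {
    return lower == upper;
  }
}

/** Hypothetical helper answering IN-style questions over a SEARCH argument. */
final class InView {
  private InView() {}

  /** True if every range is a single point, i.e. the SEARCH is a plain IN list. */
  static boolean isInList(List<ToyRange> ranges) {
    if (ranges.isEmpty()) {
      return false;
    }
    for (ToyRange r : ranges) {
      if (!r.isPoint()) {
        return false;
      }
    }
    return true;
  }

  /** The IN-list values if the SEARCH is a plain IN list, otherwise empty. */
  static Optional<List<Integer>> inValues(List<ToyRange> ranges) {
    if (!isInList(ranges)) {
      return Optional.empty();
    }
    List<Integer> values = new ArrayList<>();
    for (ToyRange r : ranges) {
      values.add(r.lower);
    }
    return Optional.of(values);
  }
}
```

A custom rule that used to match on IN could then ask InView.isInList(...) on 
the SEARCH argument and read the values through InView.inValues(...), without 
needing an IN call in the plan during optimization.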

On top of that we could also envision a RelToRel operation (a visitor?) that 
can explicitly translate SEARCH to IN.
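
As a rough illustration of that visitor, the sketch below rewrites a toy 
expression tree after optimization. All types (Expr, Search, In, And, 
SearchToIn) are hypothetical stand-ins for illustration only, not Calcite 
classes, and the choice to leave a SEARCH untouched when it contains a real 
range is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy expression tree; these types are illustrative, NOT Calcite's RexNode API. */
interface Expr {}

/** SEARCH(column, ranges): each int[]{lo, hi} is a closed range. */
final class Search implements Expr {
  final String column;
  final List<int[]> ranges;

  Search(String column, List<int[]> ranges) {
    this.column = column;
    this.ranges = ranges;
  }
}

/** column IN (values). */
final class In implements Expr {
  final String column;
  final List<Integer> values;

  In(String column, List<Integer> values) {
    this.column = column;
    this.values = values;
  }
}

/** Conjunction of child expressions. */
final class And implements Expr {
  final List<Expr> operands;

  And(List<Expr> operands) {
    this.operands = operands;
  }
}

/** Hypothetical post-optimization rewrite: a SEARCH over points becomes IN. */
final class SearchToIn {
  private SearchToIn() {}

  static Expr rewrite(Expr e) {
    if (e instanceof And) {
      // Recurse into the operands and rebuild the conjunction.
      List<Expr> rewritten = new ArrayList<>();
      for (Expr operand : ((And) e).operands) {
        rewritten.add(rewrite(operand));
      }
      return new And(rewritten);
    }
    if (e instanceof Search) {
      Search s = (Search) e;
      List<Integer> values = new ArrayList<>();
      for (int[] r : s.ranges) {
        if (r[0] != r[1]) {
          return s; // a real range: IN cannot express it, so keep the SEARCH
        }
        values.add(r[0]);
      }
      return values.isEmpty() ? s : new In(s.column, values);
    }
    return e; // any other node is left unchanged
  }
}
```

Running SearchToIn.rewrite once on the final plan keeps the translation logic 
in a single place, which is the property the issue description asks for.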

WDYT?

> Restore use of IN operator in RexCall
> -------------------------------------
>
>                 Key: CALCITE-7232
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7232
>             Project: Calcite
>          Issue Type: Task
>            Reporter: Stamatis Zampetakis
>            Priority: Major
>
> The use of {{IN}} operator in {{RexCall}} was superseded by the introduction 
> of the {{SEARCH}} operator (CALCITE-4173) and its use is strictly forbidden 
> through 
> [assertions|https://github.com/apache/calcite/blob/6cbbf560b721cb88354c33751aa72b16a58ded23/core/src/main/java/org/apache/calcite/rex/RexCall.java#L94].
>  The {{SEARCH}} operator is more general and powerful than {{IN}} so it's a 
> perfect abstraction to use during the optimization phase.
> However, most databases don't have a {{SEARCH}} operator so the latter needs 
> to be transformed back to {{IN}} (or something else) at some point in time. 
> For instance, Apache Hive has two ways of generating an executable plan:
>  * take a {{RelNode}} and generate an AST tree
>  * take a {{RelNode}} and generate a Hive Operator tree
> both of which are eventually going to be executed.
> *If we don't allow* IN in a RexCall, then it means that we need to create 
> special code to handle SEARCH in both code paths that differ only slightly in 
> each case. (In reality the situation is more complicated for Hive because 
> there are at least two more places where we need to do a SEARCH to IN 
> transformation).
> *If we allow IN* in a RexCall, then at the end of the RelNode optimization 
> phase we can "expand" {{SEARCH}} to {{IN}} so the transformation logic only 
> appears in one place and it remains a {{RelNode}} to {{RelNode}} conversion. 
> In fact, the same transformation logic could be exploited in 
> [SqlImplementor|https://github.com/apache/calcite/blob/6cbbf560b721cb88354c33751aa72b16a58ded23/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L815]
>  that does another {{RelNode}} to "something" conversion.
> The obvious downside with this proposal is that if people start mixing the IN 
> operator in various optimization rules/phases it can certainly affect the 
> quality of the plans and the planning time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
