[
https://issues.apache.org/jira/browse/CALCITE-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Runkang He reassigned CALCITE-5661:
-----------------------------------
Assignee: Runkang He
> Introduce another way to convert IN predicate to RelNode when IN list is large
> ------------------------------------------------------------------------------
>
> Key: CALCITE-5661
> URL: https://issues.apache.org/jira/browse/CALCITE-5661
> Project: Calcite
> Issue Type: Improvement
> Affects Versions: 1.34.0
> Reporter: Runkang He
> Assignee: Runkang He
> Priority: Major
>
> When IN list is large, the plan generation is time-consuming, after
> benchmark, when the IN value list size was 3w, it took 2 minutes to generate
> the final plan.
> {code:sql}
> select empno from emp where deptno in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
> ..., 30000){code}
> We find that in sql-to-rel phase, there are two methods to convert IN
> predicate to RelNode:
> 1.IN list size is below InSubQueryThreshold, convert IN to OR;
> 2.IN list size is over InSubQueryThreshold, convert IN to VALUES + JOIN.
> The first one will be very time-consuming in the expression simplification
> stage for the large OR predicate. As mentioned before, when the IN value list
> size was 3w, it took 2 minutes, which is not acceptable in OLAP scenarios.
> The second one will not be able to apply predicate pushdown, which it is very
> important in OLAP scenarios.
> So maybe we need to support converting IN to RexCall directly to avoid the
> disadvantages of the above two methods.
> After POC, when convert IN to RexCall directly, it takes less than 1 second
> to generate the final plan.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)