[
https://issues.apache.org/jira/browse/CALCITE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Krisztian Kasa updated CALCITE-6513:
------------------------------------
Description:
CALCITE-3774 prevents merging projects when the expressions in the merged
project would be too complex, leading to slow compilation or running out of
memory.
However, when a {{Filter}} sits on top of the {{Projects}} and its predicate
references the complex expressions, {{FilterProjectTransposeRule}} pushes the
{{Filter}} below the bottom {{Project}}, merging the expressions and causing
an OOM.
The issue was initially reproduced using Hive with the Hive version of
{{FilterProjectTransposeRule}}. See: HIVE-28264
Calcite is also affected:
[https://github.com/kasakrisz/calcite/commit/b35a02f368624a9c4768f348cd072a95ed6de3e1]
Consider the following query:
{code}
SELECT x1 FROM
  (SELECT 'L1' || x0 || x0 || x0 || x0 AS x1 FROM
    (SELECT 'L0' || ENAME || ENAME || ENAME || ENAME AS x0 FROM emp) t1) t2
WHERE x1 = 'Something'
{code}
Set the {{bloat}} property of {{RelBuilder.Config}} to 3.
The initial plan of the query is:
{code}
LogicalProject(X1=[$0])
  LogicalFilter(condition=[=($0, 'Something')])
    LogicalProject(X1=[||(||(||(||('L1', $0), $0), $0), $0)])
      LogicalProject(X0=[||(||(||(||('L0', $1), $1), $1), $1)])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
The expressions in the {{Project}} operators are mergeable, but the complexity
of the resulting expression exceeds the limit set by the bloat value of 3 in
our example.
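To make the complexity argument concrete, here is a minimal sketch that counts expression nodes for the two {{Project}} expressions and for their merged form. It is a simplified model, not Calcite code: the helper names ({{node_count}}, {{inline}}, {{concat_chain}}, {{merge_allowed}}) are hypothetical, and the threshold assumes the documented meaning of bloat, namely the number of nodes by which the merged expressions may exceed the two originals combined.

```python
# Simplified model, not Calcite code. All helper names are hypothetical.
# A call is a tuple ("op", arg, ...), an input ref is an int, a literal is a str.

def node_count(e):
    """Count operator and leaf nodes of an expression."""
    if isinstance(e, tuple):
        return 1 + sum(node_count(a) for a in e[1:])
    return 1

def inline(e, bottom):
    """Replace each input ref in e with the bottom expression it points to."""
    if isinstance(e, int):
        return bottom[e]
    if isinstance(e, tuple):
        return (e[0],) + tuple(inline(a, bottom) for a in e[1:])
    return e  # literal

def concat_chain(prefix, ref, times=4):
    """Build ||(||(||(||(prefix, ref), ref), ref), ref)."""
    e = prefix
    for _ in range(times):
        e = ("||", e, ref)
    return e

def merge_allowed(top, bottom, bloat):
    """Assumed rule: merge only if the merged node count does not exceed
    top + bottom + bloat; a negative bloat never merges."""
    if bloat < 0:
        return False
    merged = [inline(e, bottom) for e in top]
    count = lambda exprs: sum(node_count(e) for e in exprs)
    return count(merged) <= count(top) + count(bottom) + bloat

x0 = concat_chain("L0", 1)   # lower Project: 'L0' || ENAME || ... ($1)
x1 = concat_chain("L1", 0)   # upper Project: 'L1' || x0 || ...    ($0)
merged = inline(x1, [x0])

print(node_count(x0), node_count(x1), node_count(merged))  # 9 9 41
print(merge_allowed([x1], [x0], bloat=3))                  # False
```

Under this model the merged expression has 41 nodes against 9 + 9 + 3 = 21 allowed, so the two {{Project}} operators stay separate.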
However, when {{FilterProjectTransposeRule}} is applied, the expressions in the
{{Project}} operators are still merged, because the expression in the upper
{{Project}} references the expression in the lower {{Project}} and the
predicate in the {{Filter}} operator references it too. The limit is not
applied in this case, so we end up with the following plan:
{code}
LogicalProject(X1=[$0])
  LogicalProject(X1=[||(||(||(||('L1', $0), $0), $0), $0)])
    LogicalProject(X0=[||(||(||(||('L0', $1), $1), $1), $1)])
      LogicalFilter(condition=[=(||(||(||(||('L1', ||(||(||(||('L0', $1), $1), $1), $1)), ||(||(||(||('L0', $1), $1), $1), $1)), ||(||(||(||('L0', $1), $1), $1), $1)), ||(||(||(||('L0', $1), $1), $1), $1)), 'Something')])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
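The growth of the pushed-down predicate can be traced with a small sketch that inlines the filter condition through each {{Project}} in turn and counts nodes after every step. This is a simplified model with hypothetical helper names, not Calcite's implementation. The {{within_bloat}} check at the end is only one conceivable guard, assuming the same threshold as for project merging (top + bottom + bloat); it is not necessarily how the issue is resolved.

```python
# Simplified model, not Calcite code. All helper names are hypothetical.
# A call is a tuple ("op", arg, ...), an input ref is an int, a literal is a str.

def node_count(e):
    """Count operator and leaf nodes of an expression."""
    if isinstance(e, tuple):
        return 1 + sum(node_count(a) for a in e[1:])
    return 1

def inline(e, project):
    """Rewrite e in terms of the Project's input by replacing input refs."""
    if isinstance(e, int):
        return project[e]
    if isinstance(e, tuple):
        return (e[0],) + tuple(inline(a, project) for a in e[1:])
    return e  # literal

def concat_chain(prefix, ref, times=4):
    """Build ||(||(||(||(prefix, ref), ref), ref), ref)."""
    e = prefix
    for _ in range(times):
        e = ("||", e, ref)
    return e

BLOAT = 3
upper = [concat_chain("L1", 0)]   # X1, references $0 of the lower Project
lower = [concat_chain("L0", 1)]   # X0, references $1 (ENAME) of the scan
condition = ("=", 0, "Something")

sizes = [node_count(condition)]
within_bloat = []
for project in (upper, lower):    # push the Filter past each Project in turn
    pushed = inline(condition, project)
    limit = node_count(condition) + sum(map(node_count, project)) + BLOAT
    within_bloat.append(node_count(pushed) <= limit)
    condition = pushed
    sizes.append(node_count(condition))

print(sizes)         # [3, 11, 43]
print(within_bloat)  # [True, False]
```

In this model the predicate grows from 3 to 11 to 43 nodes. The first push stays within the assumed threshold, while the second push, past the bottom {{Project}}, exceeds it.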
> FilterProjectTransposeRule may cause OOM when Project expressions are complex
> -----------------------------------------------------------------------------
>
> Key: CALCITE-6513
> URL: https://issues.apache.org/jira/browse/CALCITE-6513
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.38.0
>