[ 
https://issues.apache.org/jira/browse/CALCITE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated CALCITE-6513:
------------------------------------
    Description: 
CALCITE-3774 prevents merging {{Project}} operators when the expressions in 
the merged {{Project}} would be too complex, leading to slow compilation or 
out-of-memory errors.

However, when a {{Filter}} sits on top of the {{Project}} operators and its 
predicate references the complex expressions, {{FilterProjectTransposeRule}} 
tries to push the {{Filter}} below the bottom {{Project}}, merging the 
expressions and causing an OOM.

The issue was initially reproduced using Hive with the Hive version of 
{{FilterProjectTransposeRule}}. See: HIVE-28264

Calcite is also affected: 
[https://github.com/kasakrisz/calcite/commit/b35a02f368624a9c4768f348cd072a95ed6de3e1]

Consider the following query:
{code}
SELECT x1 FROM
    (SELECT 'L1' || x0 || x0 || x0 || x0 AS x1 FROM
        (SELECT 'L0' || ENAME || ENAME || ENAME || ENAME AS x0 FROM emp) t1) t2
WHERE x1 = 'Something'
{code}
Set the {{bloat}} property of {{RelBuilder.Config}} to 3.
The initial plan of the query is:
{code}
LogicalProject(X1=[$0])
  LogicalFilter(condition=[=($0, 'Something')])
    LogicalProject(X1=[||(||(||(||('L1', $0), $0), $0), $0)])
      LogicalProject(X0=[||(||(||(||('L0', $1), $1), $1), $1)])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
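
To make the limit concrete, here is a minimal sketch of what merging the two {{Project}} operators above would do to the expression size. It is a toy model in Python, not Calcite code: expressions are nested tuples, {{concat}}, {{node_count}} and {{inline}} are hypothetical helpers, and the acceptance test (the merged node count must not exceed top count + bottom count + bloat) is an assumption about how the bloat limit is applied.

```python
# Toy model, not Calcite code. An expression is either a leaf (a string such
# as "$1" or "'L0'") or a call represented as a tuple ("||", left, right).

def concat(*args):
    """Build a left-deep chain of || calls, as in the plan above."""
    expr = args[0]
    for arg in args[1:]:
        expr = ("||", expr, arg)
    return expr

def node_count(expr):
    """Count every call and every leaf in the expression tree."""
    if isinstance(expr, tuple):
        return 1 + sum(node_count(operand) for operand in expr[1:])
    return 1

def inline(expr, ref, replacement):
    """Replace each occurrence of the input reference `ref` with `replacement`."""
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(inline(o, ref, replacement) for o in expr[1:])
    return replacement if expr == ref else expr

x0 = concat("'L0'", "$1", "$1", "$1", "$1")  # lower Project expression
x1 = concat("'L1'", "$0", "$0", "$0", "$0")  # upper Project expression
merged = inline(x1, "$0", x0)                # what a merged Project would hold

bloat = 3  # assumed to play the role of the RelBuilder.Config bloat value
top, bottom, total = node_count(x1), node_count(x0), node_count(merged)
print(top, bottom, total)            # 9 9 41
print(total > top + bottom + bloat)  # True: over the limit, do not merge
```

Each of the four references to {{$0}} in the upper expression is replaced by a full copy of the lower expression, which is where the growth comes from.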
The expressions in the {{Project}} operators are mergeable, but the complexity 
of the resulting expression exceeds the limit of 3 in our example, so the 
{{Project}} operators themselves are left unmerged.
However, when {{FilterProjectTransposeRule}} is applied, the expressions in the 
{{Project}} operators are merged anyway, because the expression in the upper 
{{Project}} references the expression in the lower {{Project}} and the 
predicate in the {{Filter}} operator references it as well. The limit is not 
applied in this case, so we end up with the following plan:
{code}
LogicalProject(X1=[$0])
  LogicalProject(X1=[||(||(||(||('L1', $0), $0), $0), $0)])
    LogicalProject(X0=[||(||(||(||('L0', $1), $1), $1), $1)])
      LogicalFilter(condition=[=(||(||(||(||('L1', ||(||(||(||('L0', $1), $1), $1), $1)), ||(||(||(||('L0', $1), $1), $1), $1)), ||(||(||(||('L0', $1), $1), $1), $1)), ||(||(||(||('L0', $1), $1), $1), $1)), 'Something')])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
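
The growth of the {{Filter}} condition in the plan above can be reproduced with a small toy model. This is a Python sketch, not Calcite code: the tuple representation and the {{push_past_project}} helper are hypothetical and only mimic the substitution of input references that happens when a predicate is moved below a {{Project}}.

```python
# Toy model, not Calcite code. Leaves are strings; calls are (op, operand...).

def concat(*args):
    """Build a left-deep chain of || calls."""
    expr = args[0]
    for arg in args[1:]:
        expr = ("||", expr, arg)
    return expr

def node_count(expr):
    """Count every call and every leaf in the expression tree."""
    if isinstance(expr, tuple):
        return 1 + sum(node_count(o) for o in expr[1:])
    return 1

def push_past_project(condition, project_exprs):
    """Rewrite `condition` in terms of the Project's input by replacing each
    reference "$i" with the Project expression it points to."""
    if isinstance(condition, tuple):
        return (condition[0],) + tuple(
            push_past_project(o, project_exprs) for o in condition[1:])
    return project_exprs.get(condition, condition)

# Output column $0 of each Project, keyed by the reference used above it.
upper = {"$0": concat("'L1'", "$0", "$0", "$0", "$0")}
lower = {"$0": concat("'L0'", "$1", "$1", "$1", "$1")}

condition = ("=", "$0", "'Something'")
sizes = [node_count(condition)]
for project in (upper, lower):
    condition = push_past_project(condition, project)
    sizes.append(node_count(condition))
print(sizes)  # [3, 11, 43]
```

In this example every extra {{Project}} level with four references to its input multiplies the size of the pushed-down condition by roughly four, so a handful of nested levels is enough to produce a very large predicate.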




> FilterProjectTransposeRule may cause OOM when Project expressions are complex
> -----------------------------------------------------------------------------
>
>                 Key: CALCITE-6513
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6513
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.38.0
>
>



