[ 
https://issues.apache.org/jira/browse/HIVE-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-27296:
---------------------------------------
    Description: 
The {{HiveRelDecorrelator}} does not handle {{Values}} expressions: when such 
an expression exists in the plan, it fails to remove the respective 
{{Correlate}}.

In HIVE-27278, we discovered a query that has a correlation over an empty 
{{Values}} expression.
{code:sql}
EXPLAIN CBO SELECT id FROM t1 WHERE NULL IN (SELECT NULL FROM t2 WHERE t1.id = t2.id);
{code}
The CBO plan after decorrelation is shown below.
{noformat}
HiveProject(id=[$0])
  LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}])
    HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveValues(tuples=[[]])
{noformat}
Although in HIVE-27278 we found a solution for plans that contain an empty 
{{Values}}, there can be queries with correlations on non-empty {{Values}}, 
and for those we don't have a solution at the moment.

Normally after decorrelation we shouldn't have any {{Correlate}} expressions in 
the plan.

The problem starts in 
[HiveRelDecorrelator.decorrelate(Values)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L471],
which returns null when it encounters a {{Values}} expression.

Later, 
[HiveRelDecorrelator.decorrelate(Correlate)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1247]
bails out when processing the {{Correlate}} because one of its inputs was 
not rewritten.
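
To make the failure mode concrete, here is a minimal, self-contained toy model of the behaviour described above. It is only a sketch: {{Node}}, {{decorrelate}} and {{containsCorrelate}} are hypothetical names invented for illustration and are not Hive or Calcite classes. The only assumptions taken from the description are that rewriting {{Values}} yields null and that rewriting {{Correlate}} gives up when one of its inputs was not rewritten.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the bail-out; none of these types are Hive or Calcite classes. */
class DecorrelateSketch {

  /** Hypothetical stand-in for a plan node (relational expression). */
  static class Node {
    final String kind; // e.g. "Project", "Correlate", "TableScan", "Values", "Join"
    final List<Node> inputs;

    Node(String kind, Node... inputs) {
      this.kind = kind;
      this.inputs = Arrays.asList(inputs);
    }
  }

  /** Returns the rewritten node, or null when the node could not be rewritten. */
  static Node decorrelate(Node node) {
    switch (node.kind) {
      case "Values":
        // Assumption from the description: Values is never rewritten.
        return null;
      case "Correlate": {
        Node left = decorrelate(node.inputs.get(0));
        Node right = decorrelate(node.inputs.get(1));
        if (left == null || right == null) {
          return null; // bail out: one of the inputs was not rewritten
        }
        return new Node("Join", left, right);
      }
      default: {
        // Other operators keep the original input when its rewrite failed.
        Node[] newInputs = new Node[node.inputs.size()];
        for (int i = 0; i < newInputs.length; i++) {
          Node original = node.inputs.get(i);
          Node rewritten = decorrelate(original);
          newInputs[i] = rewritten != null ? rewritten : original;
        }
        return new Node(node.kind, newInputs);
      }
    }
  }

  /** Checks the invariant that no Correlate remains after decorrelation. */
  static boolean containsCorrelate(Node node) {
    if (node.kind.equals("Correlate")) {
      return true;
    }
    for (Node input : node.inputs) {
      if (containsCorrelate(input)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Same shape as the plan above: Project over Correlate(TableScan, Values).
    Node overValues = new Node("Project",
        new Node("Correlate", new Node("TableScan"), new Node("Values")));
    // Control case: the right input is something the toy rewriter can handle.
    Node overScan = new Node("Project",
        new Node("Correlate", new Node("TableScan"), new Node("TableScan")));

    System.out.println(containsCorrelate(decorrelate(overValues))); // true
    System.out.println(containsCorrelate(decorrelate(overScan)));   // false
  }
}
```

In this model the {{Correlate}} survives only because the null returned for {{Values}} propagates one level up, which matches the plan shown earlier.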

The problem is still present in the latest Calcite (CALCITE-5568).

  was:
The {{HiveRelDecorrelator}} does not cope well with {{Values}} expressions and 
when such expression exists in the plan it fails to remove the respective 
{{Correlate}}.

In HIVE-27298, we discovered a query that has a correlation over an empty 
{{Values}} expression.
{code:sql}
EXPLAIN CBO SELECT id FROM t1 WHERE NULL IN (SELECT NULL FROM t2 where t1.id = 
t2.id);{code}

The CBO plan after decorrelation is shown below.
{noformat}
HiveProject(id=[$0])
  LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}])
    HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveValues(tuples=[[]])
{noformat}

Although, in HIVE-27298 we could find a solution for a plan that contains an 
empty {{Values}} there can be queries with correlations on non-empty {{Values}} 
and for those we don't have a solution at the moment.

Normally after decorrelation we shouldn't have any {{Correlate}} expressions in 
the plan.

The problem starts from 
[HiveRelDecorrelator.decorrelate(Values)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L471]
 that returns null when it encounters the {{Values}} expression.

Later, in 
[HiveRelDecorrelator.decorrelate(Correlate)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1247]
 it will bail out when treating the {{Correlate}} since one of the inputs is 
not rewritten. 

The problem is still there in latest Calcite (CALCITE-5568).


> HiveRelDecorrelator does not handle correlation with Values
> -----------------------------------------------------------
>
>                 Key: HIVE-27296
>                 URL: https://issues.apache.org/jira/browse/HIVE-27296
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
