[
https://issues.apache.org/jira/browse/SPARK-52488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan reassigned SPARK-52488:
-----------------------------------
Assignee: Mihailo Aleksic
> Strip alias before wrapping outer references under HAVING
> ---------------------------------------------------------
>
> Key: SPARK-52488
> URL: https://issues.apache.org/jira/browse/SPARK-52488
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.1.0
> Reporter: Mihailo Aleksic
> Assignee: Mihailo Aleksic
> Priority: Major
>
> For the following query:
> SELECT col1 AS alias
> FROM values(named_struct('a', 1))
> GROUP BY col1
> HAVING (
>   SELECT col1.a = 1
> );
> this is the resulting analyzed plan:
> Filter cast(scalar-subquery#8847 [alias#8846] as boolean)
> :  +- Project [(outer(alias#8846).a = 1) AS (outer(col1).a AS a = 1)#8867]
> :     +- OneRowRelation
> +- Aggregate [col1#8865], [col1#8865 AS alias#8846]
>    +- LocalRelation [col1#8865]
> As can be seen, the auto-generated Alias name for col1.a = 1 contains
> outer(col1).a AS a. The inner "AS a" is redundant and should be removed. It
> doesn't affect the output schema, so changing the Alias name here is safe.
> After the change, the plan looks like:
> Filter cast(scalar-subquery#x [alias#x] as boolean)
> :  +- Project [(outer(alias#x).a = 1) AS (outer(col1).a = 1)#x]
> :     +- OneRowRelation
> +- Aggregate [col1#x], [col1#x AS alias#x]
>    +- LocalRelation [col1#x]
> This change is needed to keep the fixed-point and single-pass analyzer
> implementations compatible.
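
The effect on the generated name can be illustrated with a small toy model. This is only a sketch: the classes and helpers below (Attr, GetField, Alias, Outer, EqualTo, Lit, transform_up, strip_aliases, wrap_outer) are hypothetical stand-ins written for this example and are not Spark's Catalyst classes or the actual patch. It assumes the auto-generated name is derived from the expression's SQL-like string, and shows that stripping the inner alias before wrapping outer references removes the redundant "AS a".

```python
from dataclasses import dataclass, fields, replace


class Expr:
    """Base class for the toy expression tree (hypothetical, not Catalyst)."""


@dataclass(frozen=True)
class Attr(Expr):
    name: str

    def sql(self):
        return self.name


@dataclass(frozen=True)
class Lit(Expr):
    value: int

    def sql(self):
        return str(self.value)


@dataclass(frozen=True)
class GetField(Expr):
    child: Expr
    field: str

    def sql(self):
        return f"{self.child.sql()}.{self.field}"


@dataclass(frozen=True)
class Alias(Expr):
    child: Expr
    name: str

    def sql(self):
        return f"{self.child.sql()} AS {self.name}"


@dataclass(frozen=True)
class Outer(Expr):
    child: Expr

    def sql(self):
        return f"outer({self.child.sql()})"


@dataclass(frozen=True)
class EqualTo(Expr):
    left: Expr
    right: Expr

    def sql(self):
        return f"({self.left.sql()} = {self.right.sql()})"


def transform_up(expr, rule):
    """Rebuild child expressions first, then apply `rule` to the node itself."""
    changes = {
        f.name: transform_up(getattr(expr, f.name), rule)
        for f in fields(expr)
        if isinstance(getattr(expr, f.name), Expr)
    }
    if changes:
        expr = replace(expr, **changes)
    return rule(expr)


def strip_aliases(expr):
    """Drop every Alias node, keeping only the aliased child."""
    return transform_up(expr, lambda e: e.child if isinstance(e, Alias) else e)


def wrap_outer(expr, outer_names):
    """Wrap attributes that belong to the outer query in Outer(...)."""
    def rule(e):
        return Outer(e) if isinstance(e, Attr) and e.name in outer_names else e
    return transform_up(expr, rule)


# col1.a = 1, where field extraction has introduced an inner alias "a".
condition = EqualTo(Alias(GetField(Attr("col1"), "a"), "a"), Lit(1))

# Wrapping first keeps the redundant inner alias in the generated name.
named_with_alias = wrap_outer(condition, {"col1"}).sql()
# Stripping the alias before wrapping yields the cleaner name.
named_stripped = wrap_outer(strip_aliases(condition), {"col1"}).sql()

print(named_with_alias)  # (outer(col1).a AS a = 1)
print(named_stripped)    # (outer(col1).a = 1)
```

Only the display name differs between the two results; the comparison itself is unchanged, which matches the description's point that the output schema is not affected.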