[ https://issues.apache.org/jira/browse/SPARK-47217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Asif updated SPARK-47217:
-------------------------
Description:
In some flavours of self-join queries, or nested joins in which a relation is
repeated, projected columns passed to the DataFrame.select API in the form
df.column can cause plan resolution to fail, because the attribute cannot be
resolved.
One scenario in which this happens is:
{noformat}
        Project ( dataframe A.column("col-a") )
           |
         Join2
        |     |
    Join1     DataFrame A
    |   |
DataFrame A   DataFrame B
{noformat}
In such cases, if the right leg of Join2 (DataFrame A) gets re-aliased during
de-duplication of relations, and the Project uses a Column definition obtained
from DataFrame A, the Column's exprId will not match the re-aliased right leg
of Join2, causing resolution failure.
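The mismatch described above can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's analyzer: the names Attr, relation, dedup_right and resolve_by_id are hypothetical, and the model assumes only what the description states, namely that de-duplication gives the repeated relation fresh expression ids while a Column captured earlier keeps the old one.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # hypothetical stand-in for a global exprId generator


@dataclass(frozen=True)
class Attr:
    """Toy attribute: a column name plus an expression id."""
    name: str
    expr_id: int


def relation(*names):
    """Create a leaf relation whose attributes get fresh expression ids."""
    return [Attr(n, next(_ids)) for n in names]


def dedup_right(left, right):
    """Re-alias right-leg attributes whose ids already occur on the left."""
    left_ids = {a.expr_id for a in left}
    return [Attr(a.name, next(_ids)) if a.expr_id in left_ids else a
            for a in right]


def resolve_by_id(col, output):
    """Return the attribute in `output` with the same expr_id, or None."""
    return next((a for a in output if a.expr_id == col.expr_id), None)


df_a = relation("col-a")
df_b = relation("col-b")
col_a = df_a[0]  # what a Column obtained from DataFrame A would carry

join1 = df_a + df_b                    # Join1 output: DataFrame A, DataFrame B
right_leg = dedup_right(join1, df_a)   # Join2 right leg: A, re-aliased

# The captured column no longer matches anything in the re-aliased right leg.
right_match = resolve_by_id(col_a, right_leg)
print(right_match)  # prints None
```

In this model the right leg still exposes a column named "col-a", but under a new id, so a lookup by the id captured from DataFrame A finds nothing there.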
was:
In case of some flavours of nested self join queries, the projected columns
when passed to the DataFrame.select API , as form of df.column , can result
in plan resolution failure due to attribute resolution not happening.
A scenario in which this happens is
{noformat}
        Project ( dataframe A.column("col-a") )
           |
         Join2
        |     |
    Join1     DataFrame A
    |   |
DataFrame A   DataFrame B
{noformat}
In such cases, If it so happens that Join2 - right leg DataFrame A gets
re-aliased due to De-Duplication of relations, and if the project uses Column
definition obtained from DataFrame A, its exprId will not match the re-aliased
Join2 - right Leg- DataFrame A , causing resolution failure.
> De-duplication of Relations in Joins, can result in plan resolution failure
> ---------------------------------------------------------------------------
>
> Key: SPARK-47217
> URL: https://issues.apache.org/jira/browse/SPARK-47217
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.5.1
> Reporter: Asif
> Priority: Major
> Labels: Spark-SQL