[GitHub] [spark] hvanhovell opened a new pull request, #40156: [SPARK-41823][CONNECT] Scala Client resolve ambiguous columns in Join

via GitHub Thu, 23 Feb 2023 20:58:22 -0800


hvanhovell opened a new pull request, #40156:
URL: https://github.com/apache/spark/pull/40156


   ### What changes were proposed in this pull request?
   This is the Scala version of https://github.com/apache/spark/pull/39925.
   
   We introduce a `plan_id` that is assigned to each plan created by the Scala 
client and attached to the columns created by calling `Dataframe.col(..)` and 
`Dataframe.apply(..)`. This lets us later resolve each such column against the 
specific Dataframe it was created from.
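   
   As a rough illustration of the idea, and not the code in this PR, the sketch 
   below models plans and column references with plain Scala case classes. All 
   names in it (`PlanIdSketch`, `Plan`, `ColumnRef`, `newPlan`, `resolve`) are 
   hypothetical and do not correspond to Spark Connect classes. It assumes that 
   each plan gets a unique id, that a column remembers the id of the plan it was 
   created from, and that resolution looks up that id in the plan tree. The 
   user-facing pattern this is meant to support is a join condition along the 
   lines of `left("id") === right("id")`.
   
   ```scala
   import java.util.concurrent.atomic.AtomicLong

   // Toy model only: these types are hypothetical, not Spark Connect classes.
   object PlanIdSketch {
     private val nextId = new AtomicLong(0)

     // A column reference that remembers which plan produced it.
     final case class ColumnRef(name: String, planId: Long)

     // A plan node with a unique id and the column names it exposes.
     final case class Plan(id: Long, columns: Seq[String], children: Seq[Plan] = Nil) {
       // Create a column tagged with this plan's id.
       def col(name: String): ColumnRef = {
         require(columns.contains(name), s"Unknown column: $name")
         ColumnRef(name, id)
       }

       // Joining yields a new plan with its own id and both inputs as children.
       def join(other: Plan): Plan =
         Plan(nextId.getAndIncrement(), columns ++ other.columns, Seq(this, other))
     }

     def newPlan(columns: String*): Plan =
       Plan(nextId.getAndIncrement(), columns.toSeq)

     // Resolve a reference by locating the plan with the matching id in the tree,
     // instead of searching by name alone (which is ambiguous after a join).
     def resolve(root: Plan, ref: ColumnRef): Option[(Long, String)] = {
       def find(p: Plan): Option[Plan] =
         if (p.id == ref.planId) Some(p)
         else p.children.iterator.map(find).collectFirst { case Some(hit) => hit }
       find(root).filter(_.columns.contains(ref.name)).map(p => (p.id, ref.name))
     }
   }
   ```
   
   In this model, `left.col("id")` and `right.col("id")` stay distinguishable 
   after `left.join(right)` because each reference carries a different plan id, 
   even though both use the name `id`.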
   
   ### Why are the changes needed?
   Joining on columns created using `Dataframe.apply(...)` does not work when the 
column names are ambiguous. We should be able to figure out which Dataframe a 
column comes from when it is created this way.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Updated golden files. Added a test case to `ClientE2ETestSuite`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

