cloud-fan commented on PR #45552:
URL: https://github.com/apache/spark/pull/45552#issuecomment-2007037219
Column references in the classic Spark SQL DataFrame API are quite broken, and I
really don't want to add more hacks here and there to fix individual cases. In
Spark Connect, we've redesigned column references, and the new design is much
more reliable and reasonable.
How about adding a config to let classic column references use the same
implementation as Spark Connect's? Ideally, users should update their DataFrame
queries to always use named columns, as in the SQL API:
```scala
val df1 = abc.as("df1")
val df2 = xyz.as("df2")
df1.join(df2, $"df1.col" === $"df2.col")
```
But if users really want to stick with the old style, they can turn on the
config.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]