agrawalpooja opened a new pull request #24151: [SPARK-26739][SQL][WIP] Standardized Join Types for DataFrames
URL: https://github.com/apache/spark/pull/24151

## What changes were proposed in this pull request?

This PR tries to address the concern raised in [SPARK-26739](https://issues.apache.org/jira/browse/SPARK-26739).

To summarise: currently, the join functions on DataFrames take the join type as a string parameter called joinType. To find out which joins are possible, a developer must look up the API documentation for join. While this works, a typo in the string can result in an improper join and/or an unexpected error that isn't evident at compile time.

The objective of this improvement is to let developers use a common definition for join types (an enum or constants) called JoinTypes. This would contain the possible joins and remove the possibility of a typo. It would also allow Spark to alter the names of the joins in the future without impacting end users.

## How was this patch tested?

Tested via unit tests.
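
The JoinTypes idea described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code in this PR: the member names and the `resolve_join_type` helper are hypothetical, and the string values are assumed to match join type strings that Spark already accepts. Python has no compile step, so the sketch only shows a misspelling being rejected up front with a clear message instead of surfacing later inside a query.

```python
from enum import Enum


class JoinTypes(Enum):
    """Hypothetical shared definition of join types (names are illustrative)."""
    INNER = "inner"
    CROSS = "cross"
    FULL_OUTER = "full_outer"
    LEFT_OUTER = "left_outer"
    RIGHT_OUTER = "right_outer"
    LEFT_SEMI = "left_semi"
    LEFT_ANTI = "left_anti"


def resolve_join_type(join_type):
    """Return the join type string for a JoinTypes member or a legacy string.

    Hypothetical helper: unknown strings are rejected with a ValueError
    that lists the valid options.
    """
    if isinstance(join_type, JoinTypes):
        return join_type.value
    try:
        # Legacy path: look the string up by value, ignoring case.
        return JoinTypes(join_type.lower()).value
    except ValueError:
        valid = ", ".join(member.value for member in JoinTypes)
        raise ValueError(
            f"Unsupported join type '{join_type}'. Valid types: {valid}"
        ) from None
```

With such a definition, a caller would write something like `resolve_join_type(JoinTypes.LEFT_OUTER)` and pass the result wherever the join type string is expected today. A misspelled member such as `JoinTypes.LEFT_OUTTER` fails at the point of reference with an `AttributeError`, and editors or static checkers can flag it before the job runs.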
