agrawalpooja opened a new pull request #24151: [SPARK-26739][SQL][WIP] 
Standardized Join Types for DataFrames
URL: https://github.com/apache/spark/pull/24151
 
 
   ## What changes were proposed in this pull request?
   Tries to address the concern raised in 
[SPARK-26739](https://issues.apache.org/jira/browse/SPARK-26739).
   To summarise: the join functions on DataFrames currently take the join type 
as a string parameter called joinType. To find out which joins are possible, a 
developer must look up the API documentation for join. This works, but a typo 
in the string is not caught at compile time and can result in an improper join 
or an unexpected error at runtime. 
The objective of this improvement is to let developers use a common 
definition for join types (an enum or constants) called JoinTypes. This would 
list the possible joins and remove the possibility of a typo. It would also 
allow Spark to change the names of the joins in the future without impacting 
end-users.
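
   As a rough illustration of the idea (not the code in this pull request), 
the sketch below models the join types as an enum in Python. The class name 
`JoinTypes` comes from the description above; the member names, the helper 
`resolve_join_type`, and the choice to accept only the canonical spellings are 
assumptions made for this example. It does not import Spark.

```python
from enum import Enum


class JoinTypes(str, Enum):
    """Hypothetical constants for join-type strings that Spark accepts."""

    INNER = "inner"
    CROSS = "cross"
    FULL_OUTER = "full_outer"
    LEFT_OUTER = "left_outer"
    RIGHT_OUTER = "right_outer"
    LEFT_SEMI = "left_semi"
    LEFT_ANTI = "left_anti"


def resolve_join_type(join_type):
    """Return the string to pass as the join-type argument.

    Accepts a JoinTypes member or a raw string. An unknown string raises
    ValueError right away instead of failing later inside the join.
    Assumption: only the canonical spellings above are accepted; shorter
    aliases that Spark itself understands are rejected by this sketch.
    """
    if isinstance(join_type, JoinTypes):
        return join_type.value
    return JoinTypes(join_type.lower()).value
```

   With PySpark, a call could then look like 
`df1.join(df2, "id", JoinTypes.LEFT_OUTER.value)`. A misspelled member such as 
`JoinTypes.LEFT_OUTTER` fails with an AttributeError at the point of use, and 
linters or IDEs can flag it before the job runs.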
   
   ## How was this patch tested?
   Tested via unit tests.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

