dilipbiswal commented on a change in pull request #24151:
[SPARK-26739][SQL][WIP] Standardized Join Types for DataFrames
URL: https://github.com/apache/spark/pull/24151#discussion_r267194219
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -927,29 +927,28 @@ class Dataset[T] private[sql](
*
* @param right Right side of the join operation.
* @param usingColumns Names of the columns to join on. This columns must exist on both sides.
- * @param joinType Type of join to perform. Default `inner`. Must be one of:
- * `inner`, `cross`, `outer`, `full`, `full_outer`, `left`, `left_outer`,
- * `right`, `right_outer`, `left_semi`, `left_anti`.
- *
+ * @param joinType Type of join to perform. Default `Inner`. Must be one of the valid JoinType:
+ * `Inner`, `Cross`, `FullOuter`, `LeftOuter`, `RightOuter`,
+ * `LeftSemi`, `LeftAnti`.
* @note If you perform a self-join using this function without aliasing the input
* `DataFrame`s, you will NOT be able to reference any columns after the join, since
* there is no way to disambiguate which side of the join you would like to reference.
*
* @group untypedrel
* @since 2.0.0
*/
- def join(right: Dataset[_], usingColumns: Seq[String], joinType: String): DataFrame = {
+ def join(right: Dataset[_], usingColumns: Seq[String], joinType: JoinType): DataFrame = {
Review comment:
Isn't this a breaking change? Are we allowed to do that? cc @HyukjinKwon
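
For illustration, here is a minimal sketch (not taken from the PR) of an existing call site that compiles against the current String-based overload. If that overload is replaced rather than kept alongside a new `JoinType` parameter, the call stops compiling, which is the source-compatibility concern raised above. The `LeftOuter` usage in the trailing comment is hypothetical and assumes the PR exposes the `JoinType` values named in the updated doc.

import org.apache.spark.sql.SparkSession

object JoinTypeCompatCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("join-type-compat")
      .getOrCreate()
    import spark.implicits._

    val orders    = Seq((1, "book"), (2, "pen")).toDF("customer_id", "item")
    val customers = Seq((1, "Alice")).toDF("customer_id", "name")

    // Existing call sites pass the join type as a String. If the String overload
    // is removed instead of kept next to the new JoinType parameter, this line
    // no longer compiles against the changed API:
    val joined = orders.join(customers, Seq("customer_id"), "left_outer")
    joined.show()

    // Under the proposed signature the same call would look something like
    // (hypothetical; assumes the JoinType values listed in the updated doc):
    //   orders.join(customers, Seq("customer_id"), LeftOuter)

    spark.stop()
  }
}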