venkata91 commented on a change in pull request #30775:
URL: https://github.com/apache/spark/pull/30775#discussion_r543610223



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -1240,6 +1240,40 @@ class Dataset[T] private[sql](
     joinWith(other, condition, "inner")
   }
 
+  /**
+   * Joins this Dataset returning value of left where `condition` evaluates to true.
+   *
+   * This is similar to the relation `join` function with one important difference in the
+   * result schema. Since `joinPartial` preserves objects present on left side of the join, the
+   * result schema is similarly nested into one column names `_1`.
+   *
+   * This type of join can be useful both for preserving type-safety with the original object
+   * types as well as working with relational data where either side of the join has column
+   * names in common.
+   *
+   * @param other Right side of the join.
+   * @param condition Join expression.
+   * @param joinType Type of join to perform. Default `inner`. Must be one of:

Review comment:
       Should we rephrase this doc comment, since an inner join is not
supported here? It currently says "Default `inner`", but the default could
probably be `LeftSemi` or `LeftAnti` instead.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -1240,6 +1240,40 @@ class Dataset[T] private[sql](
     joinWith(other, condition, "inner")
   }
 
+  /**
+   * Joins this Dataset returning value of left where `condition` evaluates to true.
+   *
+   * This is similar to the relation `join` function with one important difference in the
+   * result schema. Since `joinPartial` preserves objects present on left side of the join, the
+   * result schema is similarly nested into one column names `_1`.
+   *
+   * This type of join can be useful both for preserving type-safety with the original object
+   * types as well as working with relational data where either side of the join has column
+   * names in common.
+   *
+   * @param other Right side of the join.
+   * @param condition Join expression.
+   * @param joinType Type of join to perform. Default `inner`. Must be one of:
+   *                 `left_semi`, `left_anti`.
+   *
+   * @group typedrel
+   * @since 3.1.0
+   */
+  def joinPartial[U](other: Dataset[U], condition: Column, joinType: String): Dataset[T] = {
+    val joinedType = JoinType(joinType)
+
+    if (joinedType != LeftSemi && joinedType != LeftAnti) {
+      throw new AnalysisException("Invalid join type in joinPartial: " + joinedType.sql)

Review comment:
       Better to make this message actionable: name the supported join types,
and for any other join type tell the caller to use the `joinWith` or `join` API.
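
       A minimal sketch of what such a message could look like, written as
standalone Scala so it does not depend on Spark. The object `JoinPartialCheck`,
the method `validate`, and the use of `IllegalArgumentException` in place of
`AnalysisException` are assumptions made for illustration only; the matching is
also simplified to a lowercase comparison against the two names listed in the
Scaladoc, which may not match how `JoinType` parses its input.

```scala
import java.util.Locale

// Hypothetical helper, not part of the PR: illustrates an actionable error message.
object JoinPartialCheck {
  // The two join types the Scaladoc in the diff lists as supported.
  val Supported: Seq[String] = Seq("left_semi", "left_anti")

  // Returns the lowercased join type if supported, otherwise throws with
  // a message that names the valid options and points to the other APIs.
  def validate(joinType: String): String = {
    val normalized = joinType.toLowerCase(Locale.ROOT)
    if (!Supported.contains(normalized)) {
      throw new IllegalArgumentException(
        s"Invalid join type in joinPartial: $joinType. " +
          s"Supported join types: ${Supported.mkString(", ")}. " +
          "For other join types, use the joinWith or join API.")
    }
    normalized
  }
}
```

       With this shape, `JoinPartialCheck.validate("inner")` fails with a
message that lists `left_semi` and `left_anti` and mentions `joinWith` and
`join`, while `JoinPartialCheck.validate("LEFT_ANTI")` returns the lowercased name.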






