[GitHub] [spark] cloud-fan commented on a change in pull request #32863: [SPARK-35652][SQL] joinWith on two table generated from same one

GitBox Mon, 14 Jun 2021 00:32:16 -0700


cloud-fan commented on a change in pull request #32863:
URL: https://github.com/apache/spark/pull/32863#discussion_r649707269




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -1161,6 +1161,28 @@ class Dataset[T] private[sql](
       throw new AnalysisException("Invalid join type in joinWith: " + joined.joinType.sql)
     }
 
+    // If auto self join alias is enable
+    if (sqlContext.conf.dataFrameSelfJoinAutoResolveAmbiguity) {

Review comment:
       Is this code repeated elsewhere? If so, can we move it into a shared
function?
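
To illustrate the kind of shared function this comment asks for, here is a minimal sketch in plain Scala. It is not the PR's code and does not use Spark's classes: `Expr`, `Attr`, `EqualTo`, `And`, and `resolveSelfJoinCondition` are hypothetical stand-ins, and the `leftOutput`/`rightOutput` maps are an assumed simplification of resolving a column name against each side of the join. The idea is that both `join` and `joinWith` could call one helper that rewrites trivially true predicates such as `a = a`.

```scala
// Simplified stand-ins for Catalyst expressions; NOT Spark's real classes.
sealed trait Expr
final case class Attr(name: String, exprId: Long) extends Expr
final case class EqualTo(left: Expr, right: Expr) extends Expr
final case class And(left: Expr, right: Expr) extends Expr

object SelfJoinSketch {
  // Hypothetical shared helper. A predicate whose two operands are the very
  // same attribute is trivially true; rewrite it so the left operand is
  // looked up in the left side's output and the right operand in the
  // right side's output.
  def resolveSelfJoinCondition(
      cond: Expr,
      leftOutput: Map[String, Attr],
      rightOutput: Map[String, Attr]): Expr = cond match {
    case EqualTo(a: Attr, b: Attr) if a == b =>
      (leftOutput.get(a.name), rightOutput.get(b.name)) match {
        case (Some(l), Some(r)) => EqualTo(l, r)
        case _ => cond // name missing on one side: leave the predicate alone
      }
    case And(l, r) =>
      // Recurse into conjunctions so every trivially true conjunct is rewritten.
      And(
        resolveSelfJoinCondition(l, leftOutput, rightOutput),
        resolveSelfJoinCondition(r, leftOutput, rightOutput))
    case other => other
  }
}
```

With a helper of this shape, each call site would reduce to a single call, and the question of whether to consult the config could be answered in one place.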

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -1088,26 +1112,9 @@ class Dataset[T] private[sql](
     // Otherwise, find the trivially true predicates and automatically resolves them to both sides.
     // By the time we get here, since we have already run analysis, all attributes should've been
     // resolved and become AttributeReference.
-    val resolver = sparkSession.sessionState.analyzer.resolver

Review comment:
       We don't check the `dataFrameSelfJoinAutoResolveAmbiguity` config
here. Why do we check it in `joinWith`?

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -1034,6 +1034,31 @@ class Dataset[T] private[sql](
    */
   def join(right: Dataset[_], joinExprs: Column): DataFrame = join(right, joinExprs, "inner")
 
+  /**
+   * find the trivially true predicates and automatically resolves them to both sides.
+   */
+  

Review comment:
       nit: remove the blank line here



