[GitHub] [spark] agrawaldevesh commented on a change in pull request #29342: [SPARK-32399][SQL] Full outer shuffled hash join

GitBox Sun, 09 Aug 2020 19:51:53 -0700


agrawaldevesh commented on a change in pull request #29342:
URL: https://github.com/apache/spark/pull/29342#discussion_r467667250




##########
File path: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala
##########
@@ -1188,4 +1188,42 @@ class JoinSuite extends QueryTest with SharedSparkSession with AdaptiveSparkPlan
         classOf[BroadcastNestedLoopJoinExec]))
     }
   }
+
+  test("SPARK-32399: Full outer shuffled hash join") {

Review comment:
       I didn't fully follow: do you mean that until 
https://issues.apache.org/jira/browse/SPARK-32577 is fixed, we can't be highly 
confident that this optimization produces the same results as running without 
it?
   
   All I am trying to ascertain is whether this optimization is safe in all 
cases: would a full outer join produce identical results with and without it? 
My understanding is that this is currently validated only by the newly added 
Scala unit tests above, and that it has not been validated across all 
full-outer-join scenarios because of 
https://issues.apache.org/jira/browse/SPARK-32577. Is that accurate?
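
To make the question concrete, here is a rough sketch of the kind of with/without comparison being asked about. It is not the test added in this PR. The object name, the sample data, and the use of the `SHUFFLE_MERGE` / `SHUFFLE_HASH` join hints to steer the planner are assumptions for illustration; whether the hash hint is actually honored for a full outer join depends on running a build that includes SPARK-32399, so the physical plan should be confirmed with `explain()` before trusting the comparison.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone check, not part of JoinSuite.
object FullOuterJoinEquivalenceSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("full-outer-join-equivalence-sketch")
      .getOrCreate()
    import spark.implicits._

    // Turn off broadcast joins so only shuffle-based joins are candidates.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")

    // Small inputs covering duplicate keys, unmatched keys on both sides,
    // and null keys (which never match in an equi-join).
    val left = Seq[(Option[Int], String)](
      (Some(1), "a"), (Some(2), "b"), (Some(2), "c"), (Some(4), "d"), (None, "e")
    ).toDF("k", "lv")
    val right = Seq[(Option[Int], String)](
      (Some(2), "x"), (Some(3), "y"), (Some(4), "z"), (Some(4), "w"), (None, "v")
    ).toDF("k", "rv")

    // Same logical query, two requested physical strategies.
    val viaSortMerge = left.join(right.hint("SHUFFLE_MERGE"), Seq("k"), "full_outer")
    val viaHash = left.join(right.hint("SHUFFLE_HASH"), Seq("k"), "full_outer")

    // Inspect the plans by hand: the comparison is only meaningful if the
    // two queries really use different join operators.
    viaSortMerge.explain()
    viaHash.explain()

    // Compare as multisets: row order is not guaranteed, so sort a
    // string rendering of each row before comparing.
    val expected = viaSortMerge.collect().map(_.toString).sorted.toSeq
    val actual = viaHash.collect().map(_.toString).sorted.toSeq
    assert(expected == actual, s"results differ:\n$expected\n$actual")

    spark.stop()
  }
}
```

A check like this only covers the handful of rows it is given; it does not answer whether all full-outer-join scenarios are covered, which is the open question above.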






