[GitHub] [spark] peter-toth commented on a change in pull request #29572: [SPARK-32730][SQL] Improve LeftSemi SortMergeJoin right side buffering

GitBox Wed, 02 Sep 2020 03:15:03 -0700


peter-toth commented on a change in pull request #29572:
URL: https://github.com/apache/spark/pull/29572#discussion_r481958099




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
##########
@@ -673,8 +680,29 @@ private[joins] class SortMergeJoinScanner(
    */
   private[this] var matchJoinKey: InternalRow = _
   /** Buffered rows from the buffered side of the join. This is empty if there are no matches. */
-  private[this] val bufferedMatches =
+  private[this] val bufferedMatches: AppendOnlyUnsafeRowArray = if (bufferFirstOnly) {
+    new AppendOnlyUnsafeRowArray {
+      var buffer: UnsafeRow = null
+
+      override def clear(): Unit = {
+        buffer = null
+      }
+
+      override def add(row: UnsafeRow): Unit = {
+        assert(buffer == null)

Review comment:
       No, its threshold parameters do work as expected. 
`ExternalAppendOnlyUnsafeRowArray` just looked a bit heavyweight for this case, 
where we want to store only one row. But we could also use `new 
ExternalAppendOnlyUnsafeRowArray(1, 1)` here.
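
To illustrate the trade-off, here is a minimal standalone sketch of the single-slot idea. It does not use Spark's classes: `RowBuffer` and `SingleRowBuffer` are hypothetical names made up for this example, and the row type is left generic. The only point is that a one-row buffer needs no backing array and no spill logic.

```scala
// Hypothetical stand-in for an append-only row buffer; NOT Spark's actual API.
trait RowBuffer[T] {
  def add(row: T): Unit
  def clear(): Unit
  def length: Int
  def isEmpty: Boolean = length == 0
  def iterator: Iterator[T]
}

// Single-slot buffer: holds at most one row, with no array and no spilling.
final class SingleRowBuffer[T] extends RowBuffer[T] {
  private var slot: Option[T] = None

  override def add(row: T): Unit = {
    // Same contract as the assert in the diff: clear() must be called
    // before a second row is added.
    assert(slot.isEmpty, "SingleRowBuffer already holds a row")
    slot = Some(row)
  }

  override def clear(): Unit = {
    slot = None
  }

  override def length: Int = if (slot.isDefined) 1 else 0

  // Yields the buffered row if there is one, otherwise nothing.
  override def iterator: Iterator[T] = slot.iterator
}

object SingleRowBufferDemo {
  def main(args: Array[String]): Unit = {
    val buf = new SingleRowBuffer[String]
    buf.add("first match")
    println(buf.length)
    buf.clear()
    println(buf.isEmpty)
  }
}
```

Using `new ExternalAppendOnlyUnsafeRowArray(1, 1)` instead would avoid a custom class like this, at the cost of carrying the general-purpose buffering machinery for a single row.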




