GitHub user sujithjay commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22168#discussion_r236992453

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ---
    @@ -1058,31 +1064,37 @@ private class SortMergeFullOuterJoinScanner(
        * @return true if a valid match is found, false otherwise.
        */
       private def scanNextInBuffered(): Boolean = {
    -    while (leftIndex < leftMatches.size) {
    -      while (rightIndex < rightMatches.size) {
    -        joinedRow(leftMatches(leftIndex), rightMatches(rightIndex))
    -        if (boundCondition(joinedRow)) {
    -          leftMatched.set(leftIndex)
    -          rightMatched.set(rightIndex)
    +    val leftMatchesIterator = leftMatches.generateIterator(leftIndex)
    +
    +    while (leftMatchesIterator.hasNext) {
    +      val leftCurRow = leftMatchesIterator.next()
    +      val rightMatchesIterator = rightMatches.generateIterator(rightIndex)
    --- End diff --

    Hi @viirya,

    After some deliberation, I don't think it is possible to avoid reinitialising the right iterator for each left row. Please share your thoughts on this.
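
To illustrate the point, here is a minimal sketch of the nested scan, assuming the buffered matches only expose a forward-only iterator. `BufferedRows` and `matchPairs` are hypothetical stand-ins written for this example; they are not the Spark classes touched by the diff, and the real scanner additionally resumes from saved indices.

```scala
// Hypothetical stand-in for a buffer of matched rows (not a Spark class).
final class BufferedRows[A](rows: Vector[A]) {
  // Forward-only iterator starting at `startIndex`; it cannot be rewound.
  def generateIterator(startIndex: Int = 0): Iterator[A] =
    rows.iterator.drop(startIndex)
}

object RightIteratorSketch {
  // Returns the (leftIndex, rightIndex) pairs whose rows satisfy `cond`.
  def matchPairs[A, B](left: BufferedRows[A], right: BufferedRows[B])(
      cond: (A, B) => Boolean): Vector[(Int, Int)] = {
    val out = Vector.newBuilder[(Int, Int)]
    val leftIt = left.generateIterator()
    var li = 0
    while (leftIt.hasNext) {
      val l = leftIt.next()
      // A fresh right iterator is needed for every left row: the previous
      // one is exhausted once the inner loop has finished.
      val rightIt = right.generateIterator()
      var ri = 0
      while (rightIt.hasNext) {
        val r = rightIt.next()
        if (cond(l, r)) out += ((li, ri))
        ri += 1
      }
      li += 1
    }
    out.result()
  }
}
```

If the right iterator were created once outside the outer loop, only the first left row would ever be compared against the right side, because a consumed `Iterator` yields nothing on later passes.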