Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r146974005
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
---
@@ -585,21 +585,26 @@ case class SortMergeJoinExec(
val iterator = ctx.freshName("iterator")
val numOutput = metricTerm(ctx, "numOutputRows")
+ val joinedRow = ctx.freshName("joined")
--- End diff --
It ended up being a bit more complicated. There are two problems. The first
is what this fixes, which is that the INPUT_ROW in the codegen context points
to the wrong row. This is fixed and now has a test that fails if you uncomment
the line that sets INPUT_ROW.
The second problem is that, in some plans, the check for `CodegenFallback`
does not verify whether the condition supports codegen. To get the test to fail,
I had to add a projection to exercise the [path where this
happens](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L524).
I'll add a second commit for this problem.
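
To illustrate the second problem, here is a minimal toy sketch of a fallback check that skips the join condition next to one that includes it. Every type and method name below (`Expr`, `InterpretedOnly`, `JoinPlan`, `FallbackCheck`) is hypothetical and made up for illustration. It is not Spark's code and does not reproduce the actual plan traversal in `WholeStageCodegenExec.scala`.

```scala
// Toy model only: these types are hypothetical and are not Spark's classes.
sealed trait Expr { def children: Seq[Expr] }

case class Column(name: String) extends Expr {
  val children: Seq[Expr] = Nil
}

// Stands in for an expression that cannot generate code
// (playing the role of a CodegenFallback expression).
case class InterpretedOnly(children: Seq[Expr]) extends Expr

case class GreaterThan(left: Expr, right: Expr) extends Expr {
  val children: Seq[Expr] = Seq(left, right)
}

// A join with key expressions and an optional extra condition.
case class JoinPlan(keys: Seq[Expr], condition: Option[Expr])

object FallbackCheck {
  // True if the expression tree contains a node that needs interpreted evaluation.
  def needsFallback(e: Expr): Boolean = e match {
    case _: InterpretedOnly => true
    case other              => other.children.exists(needsFallback)
  }

  // Incomplete check: inspects the join keys only and never looks at the condition.
  def keysOnlyCheck(p: JoinPlan): Boolean =
    p.keys.exists(needsFallback)

  // Complete check: also walks the optional join condition.
  def fullCheck(p: JoinPlan): Boolean =
    p.keys.exists(needsFallback) || p.condition.exists(needsFallback)
}
```

With a plan whose keys are plain columns but whose condition wraps an `InterpretedOnly` node, `keysOnlyCheck` reports that no fallback is needed while `fullCheck` catches it, which is the kind of gap a test has to exercise explicitly.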
---