szehon-ho commented on code in PR #51091:
URL: https://github.com/apache/spark/pull/51091#discussion_r2151060185
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala:
##########
@@ -233,6 +246,7 @@ case class MergeRowsExec(
}
}
+ longMetric("numTargetRowsCopied") += 1
Review Comment:
Thanks @juliuszsompolski for the test case!
I think the duplicate comes from the fact that applyInstructions() handles
all three cases:
- matched
- not matched
- not matched by source
I added an extra filter so that notMatched rows do not increment the metric.
Counting them doesn't make sense, since these are source-only rows and the
metric is about target rows.
I think we still need the increment at the end, because the keepCarryOverRows
logic exists only for group-based merge. In delta-based merge, the matched
instructions do not include it, so the target row falls through to the end if
it doesn't match any instruction?
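
To make the intent concrete, here is a rough, self-contained sketch of the counting rule I have in mind. It is not the actual MergeRowsExec code: `RowKind`, `Instruction`, `CopyMetric`, and this simplified `applyInstructions` signature are all made up for illustration, and the real operator works on InternalRow and SQLMetric instead.

```scala
// Hypothetical toy model of the metric logic, not Spark's implementation.
object MergeMetricSketch {
  sealed trait RowKind
  case object Matched extends RowKind            // source and target both present
  case object NotMatched extends RowKind         // source row only
  case object NotMatchedBySource extends RowKind // target row only

  // An instruction fires when its condition holds for the row value.
  final case class Instruction(condition: Int => Boolean)

  // Stand-in for longMetric("numTargetRowsCopied").
  final class CopyMetric { var numTargetRowsCopied: Long = 0L }

  // Returns true if some instruction handled the row. Otherwise the row
  // falls through to the end and is counted as copied, unless it is a
  // source-only row, which has no target row to copy.
  def applyInstructions(
      kind: RowKind,
      value: Int,
      instructions: Seq[Instruction],
      metric: CopyMetric): Boolean = {
    if (instructions.exists(_.condition(value))) {
      true
    } else {
      // The extra filter: notMatched rows never touch the target metric.
      if (kind != NotMatched) metric.numTargetRowsCopied += 1
      false
    }
  }

  def main(args: Array[String]): Unit = {
    val metric = new CopyMetric
    val updateIfNegative = Seq(Instruction(_ < 0))
    applyInstructions(Matched, 5, updateIfNegative, metric)            // falls through, counted
    applyInstructions(NotMatchedBySource, 7, updateIfNegative, metric) // falls through, counted
    applyInstructions(NotMatched, 9, updateIfNegative, metric)         // source only, not counted
    applyInstructions(Matched, -1, updateIfNegative, metric)           // handled, not counted
    println(metric.numTargetRowsCopied) // prints 2
  }
}
```

In this model only rows that carry a target row and match no instruction reach the increment, which is the behavior I am arguing the final increment should keep for delta-based merge.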