liujiayi771 commented on code in PR #12264:
URL: https://github.com/apache/gluten/pull/12264#discussion_r3425998734
##########
backends-velox/src/main/scala/org/apache/gluten/execution/HashJoinExecTransformer.scala:
##########
@@ -197,4 +197,9 @@ case class BroadcastHashJoinContext(
buildHashTableId: String,
isNullAwareAntiJoin: Boolean = false,
bloomFilterPushdownSize: Long,
- buildHashTableTimeMetric: Option[SQLMetric] = None)
+ buildHashTableTimeMetric: Option[SQLMetric] = None) {
+ def droppedDuplicates: Boolean = {
Review Comment:
Update: after a closer look, the scenario where a filtered and a non-filtered
semi join reuse the same exchange is not actually reachable. If the filter
references non-key build columns, column pruning makes the build plans differ,
which prevents reuse. If the filter references only key columns, the plans can
match, but the difference in hash table structure (the `allowDuplicates` flag)
doesn't affect the result for semi/anti joins. So we can drop the test request.
That said, aligning with `!withFilter` is still the right call: it ensures each
join gets a hash table built for its actual native configuration rather than
relying on coincidental result equivalence.
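
To make the key-only argument concrete, here is a toy sketch in plain Scala. It
is not Gluten or Velox code: `BuildRow`, `buildTable`, `leftSemi`, and the
`dropDuplicates` parameter are hypothetical names that only model a hash table
built with or without duplicate rows per key. Under that assumption, a filter
that reads only the key gives the same semi join result for both tables, while
a filter on a non-key build column can diverge.

```scala
object SemiJoinSketch {
  // Hypothetical build-side row: one join key plus one non-key column.
  final case class BuildRow(key: Int, payload: String)

  // Model of the build-side hash table. With dropDuplicates = true only the
  // first row per key is kept (assumed analogue of allowDuplicates = false).
  def buildTable(rows: Seq[BuildRow], dropDuplicates: Boolean): Map[Int, Seq[BuildRow]] = {
    val grouped = rows.groupBy(_.key)
    if (dropDuplicates) grouped.map { case (k, rs) => k -> rs.take(1) } else grouped
  }

  // Left semi join: keep a probe key if any build row with the same key
  // passes the join filter.
  def leftSemi(
      probe: Seq[Int],
      table: Map[Int, Seq[BuildRow]],
      filter: (Int, BuildRow) => Boolean): Seq[Int] =
    probe.filter(k => table.getOrElse(k, Seq.empty).exists(r => filter(k, r)))

  val build: Seq[BuildRow] = Seq(BuildRow(1, "a"), BuildRow(1, "b"), BuildRow(2, "c"))
  val probe: Seq[Int] = Seq(1, 2, 3)

  val full: Map[Int, Seq[BuildRow]] = buildTable(build, dropDuplicates = false)
  val deduped: Map[Int, Seq[BuildRow]] = buildTable(build, dropDuplicates = true)

  // Filter that reads only the key: all duplicates of a key agree on it.
  val keyOnly: (Int, BuildRow) => Boolean = (_, r) => r.key % 2 == 1

  // Filter that reads a non-key build column: duplicates can disagree.
  val nonKey: (Int, BuildRow) => Boolean = (_, r) => r.payload == "b"

  def main(args: Array[String]): Unit = {
    println(leftSemi(probe, full, keyOnly) == leftSemi(probe, deduped, keyOnly))
    println(leftSemi(probe, full, nonKey) == leftSemi(probe, deduped, nonKey))
  }
}
```

In this model the non-key case is exactly the one that column pruning keeps
from sharing an exchange, so the mismatch never shows up through reuse. Tying
the flag to `!withFilter` just avoids depending on that.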
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]