Re: [PR] [VL] Fix broadcast hash table reuse for reused exchanges [gluten]

via GitHub Sun, 14 Jun 2026 09:16:36 -0700


wecharyu commented on PR #12264:
URL: https://github.com/apache/gluten/pull/12264#issuecomment-4702328991


   > Gluten's implementation is the same as vanilla Spark's. Just out of 
curiosity, can Spark pass this failed test?
   
   Vanilla Spark does share the same hash relation, but it does not drop 
duplicates, so both join types return correct results. Gluten's native hash 
table build, however, drops duplicates for the `LEFT SEMI JOIN`, and that 
deduplicated hash table is then reused by the `INNER JOIN`, which needs every 
matching build row. That reuse is what causes the issue.
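
   To illustrate the failure mode, here is a minimal Python sketch. It is not 
Gluten or Velox code: `build_hash_table`, `left_semi_join`, `inner_join`, and 
the sample rows are hypothetical names and data chosen only to show why a 
deduplicated build side is harmless for a semi join but loses rows when an 
inner join reuses it.

```python
from collections import defaultdict


def build_hash_table(rows, drop_duplicates):
    """Build a key -> list-of-values table from (key, value) build rows.

    drop_duplicates=True is a stand-in (assumption) for the deduplicating
    build described above: only the first row per key is kept.
    """
    table = defaultdict(list)
    for key, value in rows:
        if drop_duplicates and key in table:
            continue  # a later row with an already-seen key is discarded
        table[key].append(value)
    return table


def left_semi_join(probe, table):
    """Keep each probe row once if its key exists on the build side."""
    return [row for row in probe if row[0] in table]


def inner_join(probe, table):
    """Emit one output row per (probe row, matching build row) pair."""
    return [(k, pv, bv) for k, pv in probe for bv in table.get(k, [])]


# Hypothetical data: key 1 appears twice on the build side.
build = [(1, "a"), (1, "b"), (2, "c")]
probe = [(1, "x"), (3, "y")]

full = build_hash_table(build, drop_duplicates=False)
deduped = build_hash_table(build, drop_duplicates=True)

# The semi join only tests key existence, so deduplication is safe.
semi_full = left_semi_join(probe, full)
semi_dedup = left_semi_join(probe, deduped)

# The inner join needs every matching build row, so reusing the
# deduplicated table silently drops the (1, "x", "b") result.
inner_correct = inner_join(probe, full)
inner_wrong = inner_join(probe, deduped)

print(semi_full == semi_dedup)
print(inner_correct)
print(inner_wrong)
```

   In this sketch the semi join result is identical for both tables, while 
the inner join over the deduplicated table returns one row instead of two.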


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


