zhli1142015 commented on PR #5401: URL: https://github.com/apache/incubator-gluten/pull/5401#issuecomment-2285374735
Hello @zhztheplayer, I think this issue was raised because of #6789. Here are some observations from our side on https://github.com/facebookincubator/velox/pull/9079. It only has an impact when there are duplicate rows on the build side, which can be identified by checking whether hash probe output vectors > hash probe input vectors. In that case it leads to a longer hash build time but a shorter hash probe time. Please let me know if this is what you observed. BTW, most of the hash probe time for the join in query 95 (1TB in our test) is spent in the hash join filter. Internally we have a counter for it. I think this PR may help in that case: https://github.com/facebookincubator/velox/pull/10464
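To illustrate the duplicate-row check described in the comment, here is a minimal Python sketch. It is a toy model and not Velox or Gluten code: the function name `probe_fanout` and the use of plain key lists are assumptions made for illustration, and it counts rows rather than vectors. For an inner join, the probe can only emit more rows than it receives when some build-side key appears more than once.

```python
from collections import Counter


def probe_fanout(probe_keys, build_keys):
    """Return (input_rows, output_rows) for a toy inner hash join.

    Hypothetical helper for illustration only; it does not read real
    Velox operator statistics.
    """
    # "Build" phase: count how many build rows share each key.
    build_counts = Counter(build_keys)
    input_rows = len(probe_keys)
    # "Probe" phase: each probe row emits one output row per matching build row.
    output_rows = sum(build_counts.get(key, 0) for key in probe_keys)
    return input_rows, output_rows


# Unique build keys: output can never exceed input.
unique_in, unique_out = probe_fanout([1, 2, 3], [1, 2, 4])

# Duplicate build keys (key 1 appears three times): output exceeds input.
dup_in, dup_out = probe_fanout([1, 2, 3], [1, 1, 1, 2])

print(unique_in, unique_out)
print(dup_in, dup_out)
```

Output greater than input is a sufficient sign of build-side duplicates, not a necessary one: a build side with duplicates can still produce fewer output rows than input rows if few probe rows match.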
