Re: [I] Pushing down HashJoinExec build side dynamic filters makes tpch queries slower [datafusion]

via GitHub Mon, 19 Jan 2026 06:15:32 -0800


adriangb commented on issue #19858:
URL: https://github.com/apache/datafusion/issues/19858#issuecomment-3768551009


   > My bet is that we’re seeing the **cost of evaluation on the probe side 
(2)**, specifically in cases where the dynamic filter has low selectivity.
   
   I think that's a reasonable guess! Since this is all on known open datasets 
(TPCH), maybe we could verify these sorts of assumptions? E.g. measure the 
selectivity, creation time, and evaluation time of the filters, try [tweaking 
parallelism](https://github.com/apache/datafusion/pull/19639#issuecomment-3722465145) 
and any other possible knobs / causes, and write up a report of the impact each 
seems to have. I.e. let's be a bit scientific here: come up with multiple 
hypotheses, try to prove/disprove them with data, use the findings to design 
solution(s), and only then try to implement them (or have an LLM implement them).
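
   As a rough illustration of the kind of measurement meant here, below is a 
toy Python sketch, not DataFusion code. It assumes the dynamic filter can be 
approximated by a plain set-membership check, and it takes "selectivity" to mean 
the fraction of probe rows the filter keeps. The function `measure_filter` and 
its report keys are hypothetical names made up for this example.

```python
import time


def measure_filter(build_keys, probe_keys):
    """Time a toy set-membership filter and report the fraction of probe rows kept."""
    t0 = time.perf_counter()
    key_set = set(build_keys)  # stand-in for creating the dynamic filter
    t1 = time.perf_counter()
    kept = sum(1 for k in probe_keys if k in key_set)  # stand-in for probe-side evaluation
    t2 = time.perf_counter()
    return {
        "creation_s": t1 - t0,
        "evaluation_s": t2 - t1,
        "kept_fraction": kept / len(probe_keys) if probe_keys else 0.0,
    }


# Hypothetical data: 1000 build keys, 4000 probe keys, so a quarter of the probe rows match.
report = measure_filter(range(1000), list(range(4000)))
print(report)
```

   Collecting these three numbers per query would make it possible to see 
whether the slow cases are the ones where the filter keeps most probe rows.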


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


