adriangb commented on issue #19858: URL: https://github.com/apache/datafusion/issues/19858#issuecomment-3768551009
> My bet is that we’re seeing the **cost of evaluation on the probe side (2)**, specifically in cases where the dynamic filter has low selectivity.

I think that's a reasonable guess! Since this is all on known open datasets (TPCH), maybe we could verify these sorts of assumptions? E.g. measure the selectivity, creation time, and evaluation time of the filters, try to [tweak parallelism](https://github.com/apache/datafusion/pull/19639#issuecomment-3722465145) and any other possible knobs / causes, and make a report of the impact each seems to have.

I.e. let's be a bit scientific here: come up with multiple hypotheses, try to prove or disprove them with data, take the findings and design solution(s), and only then try to tackle them (or have an LLM tackle them).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
