maheshk114 commented on PR #41860: URL: https://github.com/apache/spark/pull/41860#issuecomment-1630198985
> @maheshk114 Thank you for the description. You show a case that has better performance, which tells me this is worth considering. But I guess applying the runtime filter on the small side will cause regressions in some scenarios. We can use TPC-DS to find out whether any cases regress.

@beliefer Yes, that's true: if the creation side is not small enough, the query will end up spending more time building the bloom filter. There is already a check for this, which prevents a bloom filter from being added if the small table (build side) is larger than bloomFilter.creationSideThreshold. On the TPC-H and TPC-DS benchmarks I have not seen any regression or improvement. The run done by @oss-maker shows the same.
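To make the size check concrete, here is a minimal sketch of the decision it describes. This is not Spark's actual code: the function name, the constant name, and the 10 MB value are assumptions chosen only for illustration, and the real optimizer rule works on plan statistics rather than a plain byte count.

```python
# Illustrative sketch only; not Spark's implementation.
# Hypothetical names: CREATION_SIDE_THRESHOLD_BYTES, should_build_bloom_filter.

# Assumed threshold value for this example (stands in for
# bloomFilter.creationSideThreshold).
CREATION_SIDE_THRESHOLD_BYTES = 10 * 1024 * 1024


def should_build_bloom_filter(creation_side_bytes,
                              threshold_bytes=CREATION_SIDE_THRESHOLD_BYTES):
    """Allow the bloom filter only if the creation (build) side
    is not larger than the threshold."""
    return creation_side_bytes <= threshold_bytes


if __name__ == "__main__":
    # A small build side passes the check; one over the threshold does not.
    print(should_build_bloom_filter(1024))
    print(should_build_bloom_filter(CREATION_SIDE_THRESHOLD_BYTES + 1))
```

With a guard like this, a build side that is too large never reaches the bloom filter construction step, which is why the extra build cost should stay bounded.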
