Re: [I] Should we support other scan types than SeqScan for runtime filter? [cloudberry]

via GitHub Wed, 16 Jul 2025 18:34:18 -0700


jiaqizho commented on issue #1234:
URL: https://github.com/apache/cloudberry/issues/1234#issuecomment-3082077351


   > Yes!
   > 
   > I want to believe that one day we could reduce the number of rows in 
SubqueryScan, and use forward and backward filter passes :)
   > 
   > That's what inspires me - [Debunking the Myth of Join Ordering: Toward 
Robust SQL Analytics](https://arxiv.org/pdf/2502.15181)
   > 
   > Here is the DuckDB implementation - 
[duckdb/duckdb#17326](https://github.com/duckdb/duckdb/pull/17326)
   > 
   > What I cannot understand right now is how to use a bloom filter in an MPP 
environment. Is it enough to create local bloom filters?
   
   @leborchuk
   
   **How to use a bloom filter in an MPP environment?** As the plan in the 
current issue shows, the `Hash Join` operator executes on every segment, and 
each segment builds its bloom filter from the inner table (in this plan, the 
BF is built from `t2`).
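
To make the per-segment behavior concrete, here is a toy simulation in Python. It is only a sketch and not Cloudberry code: `BloomFilter`, `segment_of`, `run_segment`, and `run_cluster` are hypothetical names, and it assumes both join inputs are already distributed on the join key, so matching rows always land on the same segment. Under that assumption each segment can build a filter from its local slice of the inner side and apply it to its local slice of the outer scan.

```python
import hashlib

NUM_SEGMENTS = 3  # hypothetical cluster size


class BloomFilter:
    """Minimal bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(repr(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the stride odd
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def segment_of(key):
    # Assumption: rows are distributed on the join key (modulo stands in
    # for the real distribution hash).
    return key % NUM_SEGMENTS


def run_segment(inner_keys, outer_keys):
    """One segment: build a local filter from the inner side, then
    filter the outer scan before it reaches the join."""
    bf = BloomFilter(num_bits=1024, num_hashes=3)
    for key in inner_keys:
        bf.add(key)
    survivors = [k for k in outer_keys if bf.might_contain(k)]
    inner_set = set(inner_keys)  # stands in for the hash table
    joined = [k for k in survivors if k in inner_set]
    return survivors, joined


def run_cluster(outer_keys, inner_keys):
    survivors, joined = [], []
    for seg in range(NUM_SEGMENTS):
        seg_inner = [k for k in inner_keys if segment_of(k) == seg]
        seg_outer = [k for k in outer_keys if segment_of(k) == seg]
        s, j = run_segment(seg_inner, seg_outer)
        survivors += s
        joined += j
    return survivors, joined


survivors, joined = run_cluster(list(range(100)), [3, 10, 42, 77])
print(len(survivors), sorted(joined))
```

A bloom filter has no false negatives, so the join result is unchanged; the filter only removes outer rows that cannot match on that segment.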
   
   **Is it enough to create local bloom filters?** The size of the bloom filter 
is calculated from `workmem`. That is generally sufficient, but if the inner 
table is too large for a filter of that size, the runtime filter will not be 
very effective.
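
The effect of a fixed memory budget can be estimated with the standard bloom filter approximation p ≈ (1 - e^(-kn/m))^k, where m is the number of bits, n the number of distinct inner keys, and k the number of hash functions. The sketch below is an illustration only: the 1 MiB budget and the helper names are assumptions, not how Cloudberry derives the size from `workmem`.

```python
import math


def optimal_num_hashes(num_bits, num_keys):
    """Textbook choice k = (m / n) * ln 2, clamped to at least one hash."""
    return max(1, round(num_bits / num_keys * math.log(2)))


def false_positive_rate(num_bits, num_keys, num_hashes):
    """Standard approximation p = (1 - exp(-k * n / m)) ** k."""
    return (1.0 - math.exp(-num_hashes * num_keys / num_bits)) ** num_hashes


def rates_for_budget(budget_bytes, key_counts):
    """Approximate false-positive rate for each inner-table size."""
    num_bits = budget_bytes * 8
    rates = {}
    for n in key_counts:
        k = optimal_num_hashes(num_bits, n)
        rates[n] = false_positive_rate(num_bits, n, k)
    return rates


# Hypothetical 1 MiB budget per filter; inner tables of growing size.
rates = rates_for_budget(1024 * 1024, [10**5, 10**6, 10**7, 10**8])
for n, p in rates.items():
    print(f"inner keys = {n:>11,d}  approx. false-positive rate = {p:.3g}")
```

With the budget fixed, the rate climbs toward 1 as the inner table grows, at which point almost every outer row passes the filter and the runtime filter saves little work.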
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

