zhangyue-hashdata commented on code in PR #724:
URL: https://github.com/apache/cloudberry/pull/724#discussion_r2009429137
##########
src/backend/executor/nodeSeqscan.c:
##########
@@ -87,8 +100,22 @@ SeqNext(SeqScanState *node)
/*
* get the next tuple from the table
*/
- if (table_scan_getnextslot(scandesc, direction, slot))
- return slot;
+ if (node->filter_in_seqscan && node->filters)
+ {
+ while (table_scan_getnextslot(scandesc, direction, slot))
+ {
+ if (!PassByBloomFilter(node, slot))
Review Comment:
> but bloom_create_aggresive maybe return NULL, buffer limit 1M ~2MB
Two factors make it highly likely that the Bloom filter is created
successfully. First, Bloom filters are typically built on small tables;
even when the overall dataset is as large as 1TB or 10TB, the small
tables in workloads such as TPC-DS remain relatively small. Second, the
data is distributed across segments, so the average volume of a small
table on each individual segment is smaller still. Taken together, even
at a data volume of 10TB, creation of the Bloom filter is very unlikely
to fail.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]