frankliee commented on PR #294:
URL: https://github.com/apache/incubator-uniffle/pull/294#issuecomment-1303176316

   > > Maybe we do not need precision skipping?
   > > We could use the set (expectBlockIds - processBlockIds) to build a bloomfilter.
   > > Blocks that do not match the bloomfilter can be skipped.
   > 
   > I think precision skipping is better. And I will send it only once for the same `MemoryClientReadHandler`.
   
   The client side already has precision skipping. The bitmap of all blockIds can still be very large under data skew.
   Coarse-grained skipping is widely used, for example in Parquet, Spark runtime filters, and ClickHouse.
   Besides a bloomfilter, the min-max range of blockIds is another potential option.
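
   To make the coarse-grained idea concrete, here is a minimal sketch that is not taken from the PR. It builds a small bloomfilter plus a min-max range from (expectBlockIds - processBlockIds), assuming blockIds are plain `long` values. The class name `CoarseBlockFilter`, its methods, and the hash mixing are hypothetical and chosen only for illustration.

```java
import java.util.BitSet;
import java.util.Set;

// Hypothetical sketch, not the Uniffle implementation.
// Combines a min-max range check with a tiny bloomfilter over block ids.
final class CoarseBlockFilter {
  private final BitSet bits;
  private final int numBits;
  private final int numHashes;
  private long minId = Long.MAX_VALUE; // empty filter: min > max, so nothing matches
  private long maxId = Long.MIN_VALUE;

  CoarseBlockFilter(int numBits, int numHashes) {
    this.numBits = numBits;
    this.numHashes = numHashes;
    this.bits = new BitSet(numBits);
  }

  // Build the filter from the blocks that are expected but not yet processed.
  static CoarseBlockFilter fromRemaining(
      Set<Long> expectBlockIds, Set<Long> processBlockIds, int numBits, int numHashes) {
    CoarseBlockFilter filter = new CoarseBlockFilter(numBits, numHashes);
    for (long id : expectBlockIds) {
      if (!processBlockIds.contains(id)) {
        filter.add(id);
      }
    }
    return filter;
  }

  void add(long blockId) {
    minId = Math.min(minId, blockId);
    maxId = Math.max(maxId, blockId);
    long h1 = mix(blockId);
    long h2 = mix(h1) | 1L; // odd step for double hashing
    for (int i = 0; i < numHashes; i++) {
      bits.set(index(h1 + i * h2));
    }
  }

  // false: the block is definitely not needed and can be skipped.
  // true: the block may be needed; the client's precise check still applies.
  boolean mightContain(long blockId) {
    if (blockId < minId || blockId > maxId) {
      return false; // outside the min-max range
    }
    long h1 = mix(blockId);
    long h2 = mix(h1) | 1L;
    for (int i = 0; i < numHashes; i++) {
      if (!bits.get(index(h1 + i * h2))) {
        return false;
      }
    }
    return true;
  }

  private int index(long hash) {
    return (int) Math.floorMod(hash, (long) numBits);
  }

  // 64-bit mixing function (splitmix64-style finalizer).
  private static long mix(long x) {
    x ^= x >>> 30;
    x *= 0xbf58476d1ce4e5b9L;
    x ^= x >>> 27;
    x *= 0x94d049bb133111ebL;
    x ^= x >>> 31;
    return x;
  }
}
```

   A filter like this never reports a needed block as skippable, but it can let some unneeded blocks through, so the existing precise check on the client side would still be required. Its size is fixed by `numBits` regardless of how many blockIds there are, which is the point when the full bitmap grows large.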
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

