[GitHub] [hudi] parisni commented on pull request #8716: [HUDI-6226] Support parquet native bloom filters

via GitHub Fri, 26 May 2023 07:39:21 -0700


parisni commented on PR #8716:
URL: https://github.com/apache/hudi/pull/8716#issuecomment-1564496057


   No. AFAIK, both Iceberg and Delta use bloom filters the same way the current PR 
does: Parquet push-down at read time. 
   
   Just to mention, Spark also provides a [join based on 
bloom filters](https://github.com/apache/spark/pull/35789) since 3.3, but in this case the 
bloom filter is built on the fly; it is not the Parquet bloom filter.
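
To illustrate what read-time skipping with a bloom filter means, here is a minimal Python sketch. It is not the Parquet, Iceberg, Delta, or Hudi implementation (Parquet uses its own split-block bloom filter format); the `BloomFilter` class, the `chunks_to_read` helper, and the chunk names are all hypothetical and only show the idea: a lookup key is tested against one filter per chunk, and chunks whose filter says "definitely absent" are never read.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter (illustrative only, not Parquet's on-disk format)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, value):
        # Derive num_hashes bit positions by salting a SHA-256 hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


def build_filters(chunks):
    """Build one filter per chunk, as a writer would at write time."""
    filters = {}
    for name, values in chunks.items():
        bloom = BloomFilter()
        for value in values:
            bloom.add(value)
        filters[name] = bloom
    return filters


def chunks_to_read(filters, key):
    """Read-time push-down: keep only chunks that may contain the key."""
    return [name for name, bloom in filters.items() if bloom.might_contain(key)]


# Hypothetical chunks standing in for row groups or files.
chunks = {"chunk-0": ["a1", "a2"], "chunk-1": ["b1", "b2"]}
filters = build_filters(chunks)
candidates = chunks_to_read(filters, "b2")
```

A bloom filter never gives a false negative, so the chunk that really holds the key is always in `candidates`; a false positive only costs an unnecessary read.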
   
   On May 26, 2023 5:19:25 AM UTC, Danny Chan ***@***.***> wrote:
   >Thanks for sharing. I think the Databricks BloomFilter index mainly 
serves query optimization purposes, right? Do they also use it to accelerate 
data skipping during data ingestion, aka UPSERTS?
   >
   >
   >
   >-- 
   >Reply to this email directly or view it on GitHub:
   >https://github.com/apache/hudi/pull/8716#issuecomment-1563823699
   >You are receiving this because you authored the thread.
   >
   >Message ID: ***@***.***>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
