[GitHub] [hudi] parisni commented on pull request #8716: [HUDI-6226] Support parquet native bloom filters

via GitHub Fri, 14 Jul 2023 12:23:29 -0700


parisni commented on PR #8716:
URL: https://github.com/apache/hudi/pull/8716#issuecomment-1636309573


   @nsivabalan
   
   There are existing Spark benchmarks here. Basically, writes are roughly 20% 
slower and reads are up to 4x faster. 
https://github.com/apache/spark/blob/18d0a276c501a102af3e7ed251831983b9148a4f/sql/core/benchmarks/BloomFilterBenchmark-jdk11-results.txt
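
To make the trade-off behind those numbers concrete, here is a minimal sketch in plain Python. It is not Parquet's or Hudi's implementation: the `BloomFilter` class, its sizes, and the `write_row_groups` / `lookup` helpers are hypothetical names made up for illustration. It only shows why the write side pays extra (every value is hashed into a per-row-group filter) and why point lookups can get faster (row groups whose filter rules out the key are never scanned).

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; sizes are illustrative, not Parquet's defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, value):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def write_row_groups(row_groups):
    # Write side: hashing every value into a filter is the extra cost.
    stored = []
    for rows in row_groups:
        bloom = BloomFilter()
        for value in rows:
            bloom.add(value)
        stored.append((bloom, rows))
    return stored


def lookup(stored, key):
    # Read side: scan only the row groups whose filter may contain the key.
    scanned = 0
    found = False
    for bloom, rows in stored:
        if not bloom.might_contain(key):
            continue  # skipped without reading the rows
        scanned += 1
        if key in rows:
            found = True
    return found, scanned


if __name__ == "__main__":
    groups = [[f"id-{i}" for i in range(g * 100, (g + 1) * 100)] for g in range(10)]
    stored = write_row_groups(groups)
    print(lookup(stored, "id-250"))
```

The second element of the returned tuple is the number of row groups actually scanned. With a selective key it is usually far below the total, although false positives can add a few extra scans.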
   
   
   As for documentation, please consider this PR: 
https://github.com/apache/hudi/pull/9056
   
   On July 14, 2023 6:02:18 PM UTC, Sivabalan Narayanan wrote:
   >Hey @parisni: good job on the patch. Curious to know if you have any perf 
numbers on this, on both the write and read side. What perf overhead are we 
seeing on the write side, and how much improvement are we seeing on the read 
side with the bloom filter? 
   >
   >Also, would you provide a short write-up (what this support is all about, 
how users can leverage it, and what the benefit is) that we can use on our 
release page? 
   >
   >-- 
   >Reply to this email directly or view it on GitHub:
   >https://github.com/apache/hudi/pull/8716#issuecomment-1636201917


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

