parisni commented on PR #8716: URL: https://github.com/apache/hudi/pull/8716#issuecomment-1636309573
@nsivabalan There are existing Spark benchmarks here:
https://github.com/apache/spark/blob/18d0a276c501a102af3e7ed251831983b9148a4f/sql/core/benchmarks/BloomFilterBenchmark-jdk11-results.txt
Basically, writes are about 20% slower and reads are up to 4x faster.

As for documentation, please consider this PR: https://github.com/apache/hudi/pull/9056

On July 14, 2023 6:02:18 PM UTC, Sivabalan Narayanan ***@***.***> wrote:
>hey @parisni: good job on the patch. Curious to know if you have any perf numbers on this, on both the write and read side. What perf overhead are we seeing on the write side, and how much improvement are we seeing on the read side with the bloom filter?
>
>Also, would you provide a short write-up (what this support is all about, how users can leverage it, and what the benefit is) that we can use on our release page?
>
>--
>Reply to this email directly or view it on GitHub:
>https://github.com/apache/hudi/pull/8716#issuecomment-1636201917
