[GitHub] [hudi] nsivabalan edited a comment on issue #2178: [SUPPORT] Hudi writing 10MB worth of org.apache.hudi.bloomfilter data in each of the parquet files produced

2020-12-11 Thread GitBox
nsivabalan edited a comment on issue #2178: URL: https://github.com/apache/hudi/issues/2178#issuecomment-711168471 I guess the small record size of 35 bytes throws it off. So, let's see what we can do. Before I go further, let me recap the SIMPLE bloom. Bloom filter will statically

[GitHub] [hudi] nsivabalan edited a comment on issue #2178: [SUPPORT] Hudi writing 10MB worth of org.apache.hudi.bloomfilter data in each of the parquet files produced

2020-10-16 Thread GitBox
nsivabalan edited a comment on issue #2178: URL: https://github.com/apache/hudi/issues/2178#issuecomment-710031904 If you wish to scale the bloom filter size along with the number of entries, you can try out the dynamic bloom filter. Remember this is different from hoodie.index.type which

[GitHub] [hudi] nsivabalan edited a comment on issue #2178: [SUPPORT] Hudi writing 10MB worth of org.apache.hudi.bloomfilter data in each of the parquet files produced

2020-10-16 Thread GitBox
nsivabalan edited a comment on issue #2178: URL: https://github.com/apache/hudi/issues/2178#issuecomment-710031904 If you wish to have a dynamic bloom filter that scales its size as the number of entries increases, you can try it out. Remember this is different from hoodie.index.type