Hello everyone,

I came across an ambiguity in how the bloom filter properties are handled.
In the current parquet-java implementation
<https://github.com/apache/parquet-java/blob/36a5f9cf8c1ce2c19631a0ec376665c5e41ea215/parquet-column/src/main/java/org/apache/parquet/column/impl/ColumnValueCollector.java#L179-L192>,
the fpp value for a given column is always ignored when the ndv for that
column is not specified; the bloom filter then uses the "bloom filter max
bytes" value as its exact size.
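
To make the behavior concrete, here is a minimal sketch of the sizing
decision as I understand it. It is not the parquet-java code: the class and
method names are hypothetical, the default fpp of 0.01 is an assumption made
for the sketch, and the sizing formula is the classic Bloom filter estimate
rather than whatever the library's block-split filter uses.

```java
import java.util.OptionalDouble;
import java.util.OptionalLong;

// Hypothetical illustration only; none of these names exist in parquet-java.
class BloomSizingSketch {

    // Assumed default for this sketch, used when ndv is set but fpp is not.
    static final double ASSUMED_DEFAULT_FPP = 0.01;

    // Classic Bloom filter estimate: bits = -n * ln(p) / (ln 2)^2.
    // Assumption: chosen for illustration; the library sizes its filter differently.
    static long classicNumBytes(long ndv, double fpp) {
        double bits = -ndv * Math.log(fpp) / (Math.log(2) * Math.log(2));
        return (long) Math.ceil(bits / 8.0);
    }

    // Mirrors the behavior described above: fpp only matters when ndv is present.
    static long chooseNumBytes(OptionalLong ndv, OptionalDouble fpp, long maxBytes) {
        if (ndv.isPresent()) {
            double p = fpp.orElse(ASSUMED_DEFAULT_FPP);
            // Size from ndv and fpp, capped at the configured maximum.
            return Math.min(classicNumBytes(ndv.getAsLong(), p), maxBytes);
        }
        // ndv absent: fpp is never consulted and the maximum becomes the exact size.
        return maxBytes;
    }

    public static void main(String[] args) {
        long maxBytes = 1024L * 1024L;
        // Same result for any fpp, because ndv is missing.
        System.out.println(chooseNumBytes(OptionalLong.empty(), OptionalDouble.of(0.01), maxBytes));
        System.out.println(chooseNumBytes(OptionalLong.empty(), OptionalDouble.of(0.5), maxBytes));
    }
}
```

In this sketch, both calls in main return the same size, which is the
situation I am asking about: a client that sets only the fpp gets no effect
from it.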

What should the correct behavior be when a client defines the fpp but not
the ndv?
Should the fpp i) keep being ignored, as it is now, ii) cause an error,
or iii) fall back to a default?

Best regards,
André Rosa
