Walid Gara created PARQUET-1787:
-----------------------------------
Summary: Expected distinct numbers is not parsed correctly
Key: PARQUET-1787
URL: https://issues.apache.org/jira/browse/PARQUET-1787
Project: Parquet
Issue Type: Bug
Components: parquet-mr
Reporter: Walid Gara
In the bloom filter feature, when I pass the expected numbers of distinct
values as shown below, I get null values instead of 1000 and 200.
{code:java}
import org.apache.hadoop.conf.Configuration;
Configuration conf = new Configuration();
conf.set("parquet.bloom.filter.column.names", "content,line");
conf.set("parquet.bloom.filter.expected.ndv", "1000,200");
{code}
The cause is that the expected distinct numbers are read with
[Long.getLong(expectedNDVs[i])|https://github.com/apache/parquet-mr/blob/a737141a571e3cb6cee2c252dc4406e26e6c1177/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputFormat.java#L251].
Long.getLong does not parse its argument as a number. It looks up a JVM
system property with that name (here, properties named "1000" and "200")
and returns null when no such property is set.
It can be fixed by parsing the string with
Long.parseLong(expectedNDVs[i]) instead.
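
To illustrate the difference, here is a minimal standalone sketch. It is not the parquet-mr code: the class NdvParsingDemo and the helper parseExpectedNdvs are hypothetical names used only for this example, and it assumes no JVM system property named "1000" has been set.

```java
import java.util.Arrays;

class NdvParsingDemo {

    // Hypothetical helper (not part of parquet-mr): parses a comma-separated
    // list of expected distinct-value counts such as "1000,200".
    static long[] parseExpectedNdvs(String csv) {
        String[] parts = csv.split(",");
        long[] result = new long[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Long.parseLong converts the text itself into a number.
            result[i] = Long.parseLong(parts[i].trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] expectedNDVs = "1000,200".split(",");

        // Long.getLong treats its argument as the NAME of a system property.
        // With no property called "1000" defined, the result is null.
        Long viaGetLong = Long.getLong(expectedNDVs[0]);
        System.out.println("Long.getLong:   " + viaGetLong);

        // Long.parseLong parses the string value directly.
        long viaParseLong = Long.parseLong(expectedNDVs[0]);
        System.out.println("Long.parseLong: " + viaParseLong);

        System.out.println(Arrays.toString(parseExpectedNdvs("1000,200")));
    }
}
```

Note that Long.parseLong throws NumberFormatException on malformed input, so a real fix may also want to decide how to report an invalid configuration value.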