Xinli Shang created PARQUET-1901:
------------------------------------

             Summary: Add filter null check for ColumnIndex
                 Key: PARQUET-1901
                 URL: https://issues.apache.org/jira/browse/PARQUET-1901
             Project: Parquet
          Issue Type: Bug
          Components: parquet-mr
    Affects Versions: 1.11.0
            Reporter: Xinli Shang
            Assignee: Xinli Shang
             Fix For: 1.12.0

This Jira is opened to discuss whether we should add a null check for the filter when ColumnIndex is enabled.

In the ColumnIndexFilter#calculateRowRanges() method, the input parameter 'filter' is assumed to be non-null and is never checked. The method throws an NPE when ColumnIndex is enabled (the default) but no filter is set in the ParquetReadOptions. The call stack is below.

java.lang.NullPointerException
	at org.apache.parquet.internal.filter2.columnindex.ColumnIndexFilter.calculateRowRanges(ColumnIndexFilter.java:81)
	at org.apache.parquet.hadoop.ParquetFileReader.getRowRanges(ParquetFileReader.java:961)
	at org.apache.parquet.hadoop.ParquetFileReader.readNextFilteredRowGroup(ParquetFileReader.java:891)

If we don't add the check, the user has to choose between readNextRowGroup() and readNextFilteredRowGroup() depending on whether a filter exists.

Thoughts?
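
To make the proposal concrete, here is a minimal sketch of what a null-tolerant calculateRowRanges() could look like. It is not the parquet-mr code: the class RowRangesSketch, the nested Range type, and the use of java.util.function.LongPredicate as the "filter" are hypothetical stand-ins chosen so the example runs on its own, and the real method has a different signature. The only point illustrated is the early return that treats a null filter as "no filter" and selects every row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical stand-in for illustration only; NOT the parquet-mr classes.
final class RowRangesSketch {

    /** Closed range of row indexes [from, to]. */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return "[" + from + ", " + to + "]";
        }
    }

    /**
     * Returns the ranges of rows that may match the filter.
     * A null filter is treated as "no filter": every row is selected
     * instead of a NullPointerException being thrown.
     */
    static List<Range> calculateRowRanges(LongPredicate filter, long rowCount) {
        List<Range> ranges = new ArrayList<>();
        if (rowCount <= 0) {
            return ranges; // nothing to read
        }
        if (filter == null) {
            // The proposed null check: fall back to all rows.
            ranges.add(new Range(0, rowCount - 1));
            return ranges;
        }
        long start = -1; // start of the current run of matching rows, or -1
        for (long row = 0; row < rowCount; row++) {
            if (filter.test(row)) {
                if (start < 0) {
                    start = row;
                }
            } else if (start >= 0) {
                ranges.add(new Range(start, row - 1));
                start = -1;
            }
        }
        if (start >= 0) {
            ranges.add(new Range(start, rowCount - 1));
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(calculateRowRanges(null, 10));                       // prints [[0, 9]]
        System.out.println(calculateRowRanges(row -> row >= 3 && row < 6, 10)); // prints [[3, 5]]
    }
}
```

With a check like this in place, a caller could always use the filtered read path and would not need to branch on whether a filter was configured. Whether the check belongs in calculateRowRanges() itself or in the caller is the open question for this discussion.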