[ https://issues.apache.org/jira/browse/PARQUET-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Szadovszky updated PARQUET-1901:
--------------------------------------
Fix Version/s: (was: 1.12.0)
> Add filter null check for ColumnIndex
> ---------------------------------------
>
> Key: PARQUET-1901
> URL: https://issues.apache.org/jira/browse/PARQUET-1901
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.11.0
> Reporter: Xinli Shang
> Assignee: Xinli Shang
> Priority: Major
>
> This Jira is opened to discuss whether we should add a null check for the
> filter when ColumnIndex is enabled.
> In the ColumnIndexFilter#calculateRowRanges() method, the input parameter
> 'filter' is assumed to be non-null without checking. It throws an NPE when
> ColumnIndex is enabled (the default) but no filter is set in the
> ParquetReadOptions. The call stack is as below.
> java.lang.NullPointerException
>   at org.apache.parquet.internal.filter2.columnindex.ColumnIndexFilter.calculateRowRanges(ColumnIndexFilter.java:81)
>   at org.apache.parquet.hadoop.ParquetFileReader.getRowRanges(ParquetFileReader.java:961)
>   at org.apache.parquet.hadoop.ParquetFileReader.readNextFilteredRowGroup(ParquetFileReader.java:891)
> If we don't add the check, users have to choose between readNextRowGroup() and
> readNextFilteredRowGroup() themselves, depending on whether a filter is set.
> Thoughts?
>
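For illustration only, the kind of guard under discussion could look roughly like the sketch below. It is not parquet-mr code: the class NullFilterGuard, the method calculateRows and the use of LongPredicate as the filter type are hypothetical stand-ins, and the sketch assumes that a missing filter should mean "select every row".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical stand-in, not the real ColumnIndexFilter.
public class NullFilterGuard {

    // Returns the indexes of the rows accepted by the filter.
    // Assumption: a null filter means "no filtering", so all rows match.
    public static List<Long> calculateRows(LongPredicate filter, long rowCount) {
        if (filter == null) {
            // The guard: fall back to accepting every row instead of
            // dereferencing null and throwing a NullPointerException.
            filter = row -> true;
        }
        List<Long> rows = new ArrayList<>();
        for (long row = 0; row < rowCount; row++) {
            if (filter.test(row)) {
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // No filter set: every row is selected.
        System.out.println(calculateRows(null, 3));
        // Filter set: only even row indexes are selected.
        System.out.println(calculateRows(row -> row % 2 == 0, 5));
    }
}
```

With a guard of this shape the caller can use the same read path whether or not a filter is set, which is the trade-off the description raises.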
--
This message was sent by Atlassian Jira
(v8.3.4#803005)