Xinli Shang created PARQUET-1901:
------------------------------------

             Summary: Add filter null check for ColumnIndex  
                 Key: PARQUET-1901
                 URL: https://issues.apache.org/jira/browse/PARQUET-1901
             Project: Parquet
          Issue Type: Bug
          Components: parquet-mr
    Affects Versions: 1.11.0
            Reporter: Xinli Shang
            Assignee: Xinli Shang
             Fix For: 1.12.0


This Jira is opened to discuss whether we should add a null check for the 
filter when ColumnIndex is enabled. 

In the ColumnIndexFilter#calculateRowRanges() method, the input parameter 
'filter' is assumed to be non-null and is never checked. The method throws 
an NPE when ColumnIndex is enabled (the default) but no filter is set in 
the ParquetReadOptions. The call stack is below: 
    java.lang.NullPointerException
        at org.apache.parquet.internal.filter2.columnindex.ColumnIndexFilter.calculateRowRanges(ColumnIndexFilter.java:81)
        at org.apache.parquet.hadoop.ParquetFileReader.getRowRanges(ParquetFileReader.java:961)
        at org.apache.parquet.hadoop.ParquetFileReader.readNextFilteredRowGroup(ParquetFileReader.java:891)

If we don't add the check, the caller has to choose between 
readNextRowGroup() and readNextFilteredRowGroup() depending on whether a 
filter is set. 
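
To make the question concrete, here is a minimal sketch of what such a guard 
could look like. It is illustration only: RowFilter, countSelectedRows() and 
countSelectedRowsUnchecked() are hypothetical stand-ins, not parquet-mr 
classes or methods, and treating a missing filter as "select every row" is 
an assumption about the desired behavior, not a decision. 

```java
public class NullFilterGuardSketch {

    /** Hypothetical stand-in for a read filter; not a parquet-mr type. */
    interface RowFilter {
        boolean keep(long rowIndex);
    }

    /** Mirrors the reported behavior: dereferences the filter without a check. */
    static long countSelectedRowsUnchecked(RowFilter filter, long rowCount) {
        long selected = 0;
        for (long row = 0; row < rowCount; row++) {
            if (filter.keep(row)) { // NullPointerException here when filter is null
                selected++;
            }
        }
        return selected;
    }

    /** Guarded variant: assumes a missing filter means "select every row". */
    static long countSelectedRows(RowFilter filter, long rowCount) {
        if (filter == null) {
            return rowCount;
        }
        return countSelectedRowsUnchecked(filter, rowCount);
    }

    public static void main(String[] args) {
        RowFilter evenRows = row -> row % 2 == 0;
        System.out.println(countSelectedRows(evenRows, 10)); // prints 5
        System.out.println(countSelectedRows(null, 10));     // prints 10
    }
}
```

With a guard of this shape inside the library, the caller could always use 
the filtered read path and would not need to branch on filter existence. 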

Thoughts?  

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
