[jira] [Commented] (PARQUET-1964) Add null check for getFilteredRecordCount

ASF GitHub Bot (Jira) Mon, 18 Jan 2021 21:08:07 -0800


    [ https://issues.apache.org/jira/browse/PARQUET-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267654#comment-17267654 ]


ASF GitHub Bot commented on PARQUET-1964:
-----------------------------------------

wangyum commented on a change in pull request #855:
URL: https://github.com/apache/parquet-mr/pull/855#discussion_r559916428



##########
File path: parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java
##########
@@ -827,7 +827,7 @@ public long getRecordCount() {
   }
 
   public long getFilteredRecordCount() {
-    if (!options.useColumnIndexFilter()) {
+    if (!options.useColumnIndexFilter() && options.getRecordFilter() != null) {

Review comment:
       How to reproduce this issue:
   ```scala
   val hadoopInputFile = HadoopInputFile.fromPath(new Path("/path/to/parquet/000.snappy.parquet"), new Configuration())
   val reader = ParquetFileReader.open(hadoopInputFile)
   val recordCount = reader.getFilteredRecordCount
   reader.close()
   ```
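
A minimal, dependency-free sketch of the guard that the issue title asks for, assuming the intended behavior is to fall back to the unfiltered record count whenever column-index filtering is disabled or no record filter is set. `FilteredCountSketch`, `Options`, and `filteredRecordCount` are hypothetical stand-ins for illustration, not Parquet classes or methods. Note that the sketch uses `||` with `== null`, which differs from the condition in the hunk above; that is an assumption about the intent, not a statement of what was merged.

```java
import java.util.function.LongSupplier;

// Dependency-free model of the guard; none of these types are Parquet classes.
final class FilteredCountSketch {

    /** Hypothetical stand-in for the two read options the hunk consults. */
    static final class Options {
        final boolean useColumnIndexFilter;
        final Object recordFilter; // null when the caller set no filter

        Options(boolean useColumnIndexFilter, Object recordFilter) {
            this.useColumnIndexFilter = useColumnIndexFilter;
            this.recordFilter = recordFilter;
        }
    }

    /**
     * Returns the unfiltered count when filtering is disabled or no filter is set;
     * only otherwise evaluates the filtered path, which cannot handle a null filter.
     */
    static long filteredRecordCount(Options options, long recordCount, LongSupplier filteredPath) {
        if (!options.useColumnIndexFilter || options.recordFilter == null) {
            return recordCount;
        }
        return filteredPath.getAsLong();
    }

    public static void main(String[] args) {
        // Column-index filtering enabled but no filter set: the guard short-circuits,
        // so the supplier that models the NPE-prone path is never invoked.
        Options noFilter = new Options(true, null);
        long count = filteredRecordCount(noFilter, 100L, () -> {
            throw new NullPointerException("filtered path reached with a null filter");
        });
        System.out.println(count);
    }
}
```

In this model the filtered path is passed as a supplier so that the sketch can show it is skipped entirely when the filter is null, which mirrors the reproduction above where the reader is opened without any filter.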




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> Add null check for getFilteredRecordCount
> -----------------------------------------
>
>                 Key: PARQUET-1964
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1964
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Yuming Wang
>            Priority: Major
>
> How to reproduce this issue:
> {code:scala}
> val hadoopInputFile = HadoopInputFile.fromPath(new Path("/path/to/parquet/000.snappy.parquet"), new Configuration())
> val reader = ParquetFileReader.open(hadoopInputFile)
> val recordCount = reader.getFilteredRecordCount
> reader.close()
> {code}
> Output:
> {noformat}
> java.lang.NullPointerException was thrown.
> java.lang.NullPointerException
>       at org.apache.parquet.internal.filter2.columnindex.ColumnIndexFilter.calculateRowRanges(ColumnIndexFilter.java:81)
>       at org.apache.parquet.hadoop.ParquetFileReader.getRowRanges(ParquetFileReader.java:961)
>       at org.apache.parquet.hadoop.ParquetFileReader.getFilteredRecordCount(ParquetFileReader.java:766)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
