[
https://issues.apache.org/jira/browse/PARQUET-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17655552#comment-17655552
]
Micah Kornfield commented on PARQUET-2219:
------------------------------------------
I'm not aware of anything in the specification that prevents zero-length row
groups. We can try to prevent writing them out, but I think readers should be
robust to this case as long as it isn't disallowed by the specification. For
the iterator case, it seems like the row group should just be discarded and
the next one checked?
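
To illustrate the skip-and-continue approach at the call site, here is a minimal sketch assuming parquet-mr's ParquetFileReader API (getFooter(), readNextRowGroup(), skipNextRowGroup()) as exposed in recent releases; the class name and overall structure are illustrative, not a proposed patch:

{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.column.page.PageReadStore;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.metadata.BlockMetaData;
import org.apache.parquet.hadoop.util.HadoopInputFile;

public class SkipEmptyRowGroups {
  public static void main(String[] args) throws IOException {
    Path path = new Path(args[0]);
    Configuration conf = new Configuration();

    try (ParquetFileReader reader =
             ParquetFileReader.open(HadoopInputFile.fromPath(path, conf))) {
      // Row-group metadata from the footer; a zero-row group shows up
      // here with getRowCount() == 0 even though its headers are present.
      List<BlockMetaData> blocks = reader.getFooter().getBlocks();

      for (BlockMetaData block : blocks) {
        if (block.getRowCount() == 0) {
          // Discard the empty row group and check the next one,
          // rather than letting readNextRowGroup() throw
          // "Illegal row group of 0 rows".
          reader.skipNextRowGroup();
          continue;
        }
        PageReadStore pages = reader.readNextRowGroup();
        System.out.println("Read row group with " + pages.getRowCount() + " rows");
      }
    }
  }
}
{code}

With this pattern, a file that contains only headers and empty row groups simply yields no pages instead of failing.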
> ParquetFileReader throws a runtime exception when a file contains only
> headers and no row data
> -----------------------------------------------------------------------------------------------
>
> Key: PARQUET-2219
> URL: https://issues.apache.org/jira/browse/PARQUET-2219
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.12.1
> Reporter: chris stockton
> Priority: Minor
>
> Google BigQuery has an option to export table data to Parquet-formatted
> files, but some of these files are written with header data only and no row
> data. When such a file is opened with ParquetFileReader, an exception is
> thrown:
> {{RuntimeException("Illegal row group of 0 rows");}}
> It seems like ParquetFileReader should not throw an exception when it
> encounters such a file.
> https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java#L949
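
For context on the link above, the check it points to is essentially a row-count guard on the row group's footer metadata; a simplified sketch of its shape (variable names illustrative, not the exact parquet-mr source):

{code:java}
// Roughly the shape of the guard in ParquetFileReader.readNextRowGroup():
BlockMetaData block = blocks.get(currentBlock);
if (block.getRowCount() == 0) {
  // The exception reported above: the reader rejects the zero-row group
  // outright instead of skipping it or returning an empty PageReadStore.
  throw new RuntimeException("Illegal row group of 0 rows");
}
{code}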
--
This message was sent by Atlassian Jira
(v8.20.10#820010)