[ https://issues.apache.org/jira/browse/PARQUET-324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Blue resolved PARQUET-324.
-------------------------------
    Resolution: Fixed
    Fix Version/s: 1.8.0

Thanks for contributing the fix, [~tfriedr]!

> row count incorrect if data file has more than 2^31 rows
> --------------------------------------------------------
>
>                 Key: PARQUET-324
>                 URL: https://issues.apache.org/jira/browse/PARQUET-324
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.7.0, 1.8.0
>            Reporter: Thomas Friedrich
>            Assignee: Thomas Friedrich
>            Priority: Minor
>             Fix For: 1.8.0
>
>
> If a parquet file has more than 2^31 rows, the row count written into the
> file metadata is incorrect.
> The cause of the problem is the use of an int instead of long data type for
> numRows in ParquetMetadataConverter, toParquetMetadata:
>     int numRows = 0;
>     for (BlockMetaData block : blocks) {
>       numRows += block.getRowCount();
>       addRowGroup(parquetMetadata, rowGroups, block);
>     }

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
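
To illustrate the overflow described in the report, here is a minimal standalone Java sketch. It is not the parquet-mr source: the class RowCountOverflow, the methods sumAsInt and sumAsLong, and the sample row counts are hypothetical. It assumes block.getRowCount() returns a long, which is what allows "numRows += ..." to compile with an int accumulator and silently narrow the result.

```java
import java.util.Arrays;
import java.util.List;

public class RowCountOverflow {

    // Mirrors the pattern quoted in the report: an int accumulator fed long values.
    // The compound assignment carries an implicit (int) cast, so no compile error.
    public static int sumAsInt(List<Long> rowCounts) {
        int numRows = 0;
        for (long rowCount : rowCounts) {
            numRows += rowCount; // silently narrows long to int
        }
        return numRows;
    }

    // Same loop with a long accumulator (assumed shape of a fix, not the actual patch).
    public static long sumAsLong(List<Long> rowCounts) {
        long numRows = 0;
        for (long rowCount : rowCounts) {
            numRows += rowCount;
        }
        return numRows;
    }

    public static void main(String[] args) {
        // Hypothetical per-row-group counts whose total exceeds Integer.MAX_VALUE.
        List<Long> rowCounts = Arrays.asList(1_500_000_000L, 1_500_000_000L);
        System.out.println(sumAsInt(rowCounts));  // prints -1294967296
        System.out.println(sumAsLong(rowCounts)); // prints 3000000000
    }
}
```

With an int accumulator the total wraps around once it passes 2^31 - 1, so the stored row count can even be negative. Declaring the accumulator as long, as in sumAsLong, avoids the narrowing. The report does not show the committed patch, so treat this only as the likely shape of the change.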