[ https://issues.apache.org/jira/browse/PARQUET-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Le Dem resolved PARQUET-196.
-----------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: 1.6.0)
                   1.10.0

Issue resolved by pull request 406
[https://github.com/apache/parquet-mr/pull/406]

> parquet-tools command to get rowcount & size
> --------------------------------------------
>
>                 Key: PARQUET-196
>                 URL: https://issues.apache.org/jira/browse/PARQUET-196
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.6.0
>            Reporter: Swapnil
>            Priority: Minor
>              Labels: features
>             Fix For: 1.10.0
>
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> Parquet files contain metadata about row count and file size. We should have
> new commands to get the row count and size.
> These commands can be added to parquet-tools:
> 1. rowcount: This should add up the number of rows in all footers to give the
> total number of rows in the data.
> 2. size: This should give the compressed size in bytes and in human-readable
> format too.
> These commands help us avoid parsing job logs or loading the data once again
> to find the number of rows in the data. This comes in very handy in complex
> processes, stats generation, QA, etc.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
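
The two commands described in the issue reduce to a sum over per-footer row counts and a byte count printed in a scaled unit. The following is only a rough sketch of that arithmetic, not the code from pull request 406. The function names total_row_count and human_readable_size are hypothetical, the row counts are assumed to have already been read from each file's footer, and the choice of 1024-based units with three decimal places is an assumption that the issue does not specify.

```python
from typing import Iterable

# Hypothetical helper: the issue asks for the sum of the row counts
# recorded in all footers. Reading the footers is out of scope here;
# the caller is assumed to supply one row count per footer.
def total_row_count(row_counts_per_footer: Iterable[int]) -> int:
    return sum(row_counts_per_footer)


# Hypothetical helper: render a byte count in a human-readable form.
# Assumption: binary steps of 1024 and three decimal places.
def human_readable_size(num_bytes: int) -> str:
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value under 1024,
        # or at the largest unit available.
        if size < 1024 or unit == units[-1]:
            if unit == "B":
                return f"{num_bytes} B"
            return f"{size:.3f} {unit}"
        size /= 1024
    return f"{num_bytes} B"  # not reached; kept for completeness


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    print("Total RowCount:", total_row_count([100, 250, 50]))
    print("Total Size:", 1536, "bytes /", human_readable_size(1536))
```

Keeping the raw byte count next to the scaled value, as in the last line, preserves the exact figure for scripts while still giving a quick reading for people.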