[ https://issues.apache.org/jira/browse/PARQUET-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572615#comment-15572615 ]
Shannon Carey commented on PARQUET-196:
---------------------------------------

For those who are looking for this feature: when you run the parquet-tools "meta" command, you can find each row group's row count after "RC:", for example:

"row group 1: *RC:2040356* ..."

> parquet-tools command to get rowcount & size
> --------------------------------------------
>
>                 Key: PARQUET-196
>                 URL: https://issues.apache.org/jira/browse/PARQUET-196
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.6.0
>            Reporter: Swapnil
>            Priority: Minor
>              Labels: features
>             Fix For: 1.6.0
>
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> Parquet files contain metadata about row count and file size. We should have new
> commands to get the row count and size.
> These commands can be added to parquet-tools:
> 1. rowcount: This should add up the number of rows in all footers to give the total
> number of rows in the data.
> 2. size: This should give the compressed size in bytes and in human-readable format
> too.
> These commands help us avoid parsing job logs or loading the data once again
> to find the number of rows. This comes in very handy in complex processes,
> stats generation, QA, etc.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
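Until a dedicated rowcount command exists, the "RC:" workaround from the comment can be scripted. The following is only a rough sketch: it assumes every row-group line of the "meta" output has the shape shown in the example above ("row group N: RC:<count> ..."), and the function name total_row_count is made up for illustration. It sums the per-row-group counts to get the total for one file.

```python
import re

# Assumed line shape, taken from the example in the comment:
#   row group 1: RC:2040356 ...
# Only lines that start with "row group <n>:" are considered.
ROW_GROUP_RC = re.compile(r"^\s*row group\s+\d+:.*?\bRC:(\d+)")


def total_row_count(meta_output: str) -> int:
    """Sum the RC values of all row-group lines in `meta` output text.

    `meta_output` is the captured text of the parquet-tools "meta" command.
    Lines that do not look like row-group lines are ignored.
    """
    total = 0
    for line in meta_output.splitlines():
        match = ROW_GROUP_RC.match(line)
        if match:
            total += int(match.group(1))
    return total
```

To use it, capture the text printed by the "meta" command for a file and pass that string to total_row_count. If your parquet-tools version prints row-group lines differently, the regular expression will need adjusting.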