[ 
https://issues.apache.org/jira/browse/PARQUET-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapnil updated PARQUET-196:
----------------------------
    Description: 
Parquet files contain metadata about row count and file size. We should have new 
commands to get the row count and size.
These commands can be added to parquet-tools:
1. rowcount : This should add up the number of rows in all footers to give the 
total number of rows in the data. 
2. size : This should give the compressed size in bytes and in human-readable 
format too.
These commands help us avoid parsing job logs or loading the data again to 
find the number of rows. This comes in very handy in complex processes, stats 
generation, QA, etc.

  was:
Parquet files contain metadata about row count and file size. We should have new 
commands to get the row count and size.
These commands can be added to parquet-tools:
1. rowcount : This should add up the number of rows in all footers to give the 
total number of rows in the data. 
2. size : This should give the compressed size in bytes and human-readable format.
These commands help us avoid parsing job logs or loading the data again to 
find the number of rows. This comes in very handy in complex processes, stats 
generation, QA, etc.


> parquet-tools command to get rowcount & size
> --------------------------------------------
>
>                 Key: PARQUET-196
>                 URL: https://issues.apache.org/jira/browse/PARQUET-196
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: parquet-mr_1.6.0
>            Reporter: Swapnil
>            Priority: Minor
>              Labels: features
>             Fix For: parquet-mr_1.6.0
>
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> Parquet files contain metadata about row count and file size. We should have new 
> commands to get the row count and size.
> These commands can be added to parquet-tools:
> 1. rowcount : This should add up the number of rows in all footers to give the 
> total number of rows in the data. 
> 2. size : This should give the compressed size in bytes and in human-readable 
> format too.
> These commands help us avoid parsing job logs or loading the data again to 
> find the number of rows. This comes in very handy in complex processes, stats 
> generation, QA, etc.
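
The two proposed commands could behave roughly as in the sketch below. It is only an illustration of the idea, not parquet-tools code: the Footer and RowGroup classes are hypothetical stand-ins for footer metadata that has already been read from each file, and the 1024-based units with two decimals are an assumed output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroup:
    """Hypothetical stand-in for one row group entry in a Parquet footer."""
    num_rows: int
    compressed_bytes: int  # assumed: sum of the compressed column chunk sizes


@dataclass
class Footer:
    """Hypothetical stand-in for the footer metadata of one Parquet file."""
    row_groups: List[RowGroup]


def total_rows(footers: List[Footer]) -> int:
    """rowcount: add up the row counts recorded in every footer."""
    return sum(rg.num_rows for f in footers for rg in f.row_groups)


def total_compressed_bytes(footers: List[Footer]) -> int:
    """size: add up the compressed sizes recorded in every footer."""
    return sum(rg.compressed_bytes for f in footers for rg in f.row_groups)


def human_readable(num_bytes: int) -> str:
    """Format a byte count with 1024-based units (an assumed convention)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that fits, or at the largest unit available.
        if size < 1024 or unit == units[-1]:
            break
        size /= 1024
    return f"{num_bytes} B" if unit == "B" else f"{size:.2f} {unit}"


if __name__ == "__main__":
    # Two made-up files: one with two row groups, one with a single row group.
    footers = [
        Footer([RowGroup(1000, 2048), RowGroup(500, 1024)]),
        Footer([RowGroup(250, 512)]),
    ]
    size = total_compressed_bytes(footers)
    print("Total RowCount:", total_rows(footers))
    print("Total Size:", size, "bytes", "(" + human_readable(size) + ")")
```

Because both totals come from footer metadata alone, neither command needs to scan the data pages, which is what makes them cheaper than reloading the data.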



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
