[
https://issues.apache.org/jira/browse/PARQUET-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735715#comment-16735715
]
ASF GitHub Bot commented on PARQUET-386:
----------------------------------------
gszadovszky commented on pull request #279: PARQUET-386: Printing out the
statistics of metadata in parquet-tools
URL: https://github.com/apache/parquet-mr/pull/279
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Printing out the statistics of metadata in parquet-tools
> --------------------------------------------------------
>
> Key: PARQUET-386
> URL: https://issues.apache.org/jira/browse/PARQUET-386
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Reporter: Onur Soyer
> Assignee: Gabor Szadovszky
> Priority: Trivial
> Labels: pull-request-available
> Fix For: 1.10.0
>
>
> While playing with "parquet-tools", I found that column statistics are
> not printed when the following command is executed:
> $ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema --detailed perf.1000.parquet
> The output for a row group looks like this:
> =====================================================================================================================
> row group 1: RC:747388 TS:134218473 OFFSET:4
> --------------------------------------------------------------------------------
> cust_key: INT64 UNCOMPRESSED DO:0 FPO:4 SZ:5979444/5979444/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> name: BINARY UNCOMPRESSED DO:0 FPO:5979448 SZ:16443766/16443766/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> address: BINARY UNCOMPRESSED DO:0 FPO:22423214 SZ:21716568/21716568/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> nation_key: INT32 UNCOMPRESSED DO:0 FPO:44139782 SZ:2989697/2989697/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> phone: BINARY UNCOMPRESSED DO:0 FPO:47129479 SZ:14201364/14201364/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> acctbal: DOUBLE UNCOMPRESSED DO:0 FPO:61330843 SZ:5979444/5979444/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> mktsegment: BINARY UNCOMPRESSED DO:0 FPO:67310287 SZ:9714675/9714675/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> comment_col: BINARY UNCOMPRESSED DO:0 FPO:77024962 SZ:57193515/57193515/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> =====================================================================================================================
> It would be great to print out the column statistics from the metadata as well.
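
For anyone who wants to post-process the per-column lines quoted above, here is a minimal sketch in Python. It is not part of parquet-tools. The function name `parse_column_line` and the dictionary keys are hypothetical, and the field meanings (DO as dictionary page offset, FPO as first data page offset, SZ as compressed/uncompressed/ratio, VC as value count, ENC as encodings) are assumptions read off the sample output. It also assumes each column entry has been joined onto a single line.

```python
import re

# Pattern for one column-chunk line, modelled only on the sample output in
# this issue. The field meanings are assumptions, not a documented format.
COLUMN_LINE = re.compile(
    r"^(?P<name>\w+):\s+(?P<type>\w+)\s+(?P<codec>\w+)"
    r"\s+DO:(?P<dict_offset>\d+)\s+FPO:(?P<first_page_offset>\d+)"
    r"\s+SZ:(?P<compressed>\d+)/(?P<uncompressed>\d+)/(?P<ratio>[\d.]+)"
    r"\s+VC:(?P<value_count>\d+)\s+ENC:(?P<encodings>[\w,]+)$"
)

INT_FIELDS = ("dict_offset", "first_page_offset", "compressed",
              "uncompressed", "value_count")


def parse_column_line(line):
    """Return a dict of fields for a column line, or None if it does not match."""
    match = COLUMN_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in INT_FIELDS:
        fields[key] = int(fields[key])
    fields["ratio"] = float(fields["ratio"])
    fields["encodings"] = fields["encodings"].split(",")
    return fields


sample = ("cust_key: INT64 UNCOMPRESSED DO:0 FPO:4 "
          "SZ:5979444/5979444/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED")
info = parse_column_line(sample)
print(info["name"], info["value_count"], info["encodings"])
```

Lines that do not follow the column layout, such as the "row group 1: ..." header, return None, so callers can skip them.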
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)