[
https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962596#comment-16962596
]
ASF GitHub Bot commented on PARQUET-1685:
-----------------------------------------
shangxinli commented on pull request #696: PARQUET-1685: Truncate Min/Max for
Statistics
URL: https://github.com/apache/parquet-mr/pull/696
Make sure you have checked _all_ steps below.
### Jira
PARQUET-1685: Truncate Min/Max for Statistics
### Tests
Added a unit test
Wrote a Spark application test, and it passed
### Commits
- [ ] My commits all reference Jira issues in their subject lines. In
addition, my commits follow the guidelines from "[How to write a good git
commit message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
1. Subject is limited to 50 characters (not including Jira issue reference)
1. Subject does not end with a period
1. Subject uses the imperative mood ("add", not "adding")
1. Body wraps at 72 characters
1. Body explains "what" and "why", not "how"
### Documentation
> Truncate the stored min and max for String statistics to reduce the footer
> size
> --------------------------------------------------------------------------------
>
> Key: PARQUET-1685
> URL: https://issues.apache.org/jira/browse/PARQUET-1685
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Affects Versions: 1.10.1
> Reporter: Xinli Shang
> Assignee: Xinli Shang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Iceberg has a cool feature that truncates the stored min and max statistics
> to minimize the metadata size. We can borrow this approach and truncate them
> in Parquet as well, to reduce the size of the footer, or even the page
> header. Here is the code in Iceberg:
> [https://github.com/apache/incubator-iceberg/blob/master/api/src/main/java/org/apache/iceberg/util/UnicodeUtil.java].
>
>
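The truncation idea in the issue description can be sketched roughly as follows. This is an illustration only, not the code from the pull request or from Iceberg's UnicodeUtil: the class and method names (TruncateStatsSketch, truncateMin, truncateMax) are hypothetical, and it assumes well-formed strings whose ordering is by Unicode code point. A truncated min is simply a prefix, which always sorts at or below the original. A truncated max must stay at or above the original, so the last code point of the prefix that can be incremented is bumped by one.

```java
import java.util.Optional;

// Hypothetical sketch; names are illustrative, not parquet-mr or Iceberg API.
final class TruncateStatsSketch {

  // Lower bound: a prefix of at most `length` code points sorts <= the original.
  static String truncateMin(String value, int length) {
    if (value.codePointCount(0, value.length()) <= length) {
      return value; // already short enough, nothing to truncate
    }
    return value.substring(0, value.offsetByCodePoints(0, length));
  }

  // Upper bound: take the prefix, then increment its last incrementable
  // code point so the result sorts >= the original. Returns empty when no
  // code point can be incremented, in which case the caller would have to
  // keep the full value or drop the statistic.
  static Optional<String> truncateMax(String value, int length) {
    if (value.codePointCount(0, value.length()) <= length) {
      return Optional.of(value);
    }
    int[] prefix = value.codePoints().limit(length).toArray();
    for (int i = prefix.length - 1; i >= 0; i--) {
      int next = prefix[i] + 1;
      if (next == Character.MIN_SURROGATE) {
        next = Character.MAX_SURROGATE + 1; // skip the surrogate range
      }
      if (next <= Character.MAX_CODE_POINT) {
        prefix[i] = next;
        // Everything after position i is dropped; the bumped code point
        // alone already makes the result greater than the original.
        return Optional.of(new String(prefix, 0, i + 1));
      }
    }
    return Optional.empty();
  }
}
```

With a length of 4, for example, the min of "parquet-footer" would be stored as "parq" and the max as "parr". A reader can still use both bounds for filtering, but they are no longer exact values from the column.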