[
https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961207#comment-16961207
]
Xinli Shang commented on PARQUET-1685:
--------------------------------------
Sounds good, [~gszadovszky] and [~rdblue]. I will post to the 'dev' mailing list
to see whether there are any objections, or whether anyone knows of an
application that relies on the exact min/max values.
> Truncate the stored min and max for String statistics to reduce the footer
> size
> --------------------------------------------------------------------------------
>
> Key: PARQUET-1685
> URL: https://issues.apache.org/jira/browse/PARQUET-1685
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Affects Versions: 1.10.1
> Reporter: Xinli Shang
> Assignee: Xinli Shang
> Priority: Major
> Fix For: 1.12.0
>
>
> Iceberg has a cool feature that truncates the stored min and max statistics to
> minimize the metadata size. We can borrow the same approach in Parquet to
> reduce the size of the footer, or even of the page header. Here is the code in
> Iceberg:
> [https://github.com/apache/incubator-iceberg/blob/master/api/src/main/java/org/apache/iceberg/util/UnicodeUtil.java].
>
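For readers new to the idea, here is a rough sketch of what truncating string
statistics could look like. It is not the Iceberg or parquet-mr code; the class
and method names (StatTruncationSketch, truncateMin, truncateMax) are made up
for illustration, and it assumes values are ordered by Unicode code point. The
point it shows is that a truncated min can simply be a prefix, while a truncated
max has to be bumped upward so that it stays a valid upper bound.

```java
import java.util.Optional;

// Hypothetical sketch, not the actual Iceberg or parquet-mr implementation.
class StatTruncationSketch {

    // A prefix never sorts after the original, so it is still a valid lower bound.
    static String truncateMin(String value, int maxCodePoints) {
        if (value.codePointCount(0, value.length()) <= maxCodePoints) {
            return value;
        }
        return value.substring(0, value.offsetByCodePoints(0, maxCodePoints));
    }

    // Take the prefix, then increment its last code point so the result sorts
    // after the original. If that code point cannot be incremented, drop it and
    // try the one before. Empty means no shorter upper bound exists, in which
    // case a caller would have to keep the original value.
    static Optional<String> truncateMax(String value, int maxCodePoints) {
        if (value.codePointCount(0, value.length()) <= maxCodePoints) {
            return Optional.of(value);
        }
        StringBuilder prefix = new StringBuilder(
                value.substring(0, value.offsetByCodePoints(0, maxCodePoints)));
        while (prefix.length() > 0) {
            int last = prefix.codePointBefore(prefix.length());
            prefix.setLength(prefix.length() - Character.charCount(last));
            int next = last + 1;
            if (next == Character.MIN_SURROGATE) {
                // Surrogates are not valid standalone characters; skip past them.
                next = Character.MAX_SURROGATE + 1;
            }
            if (next <= Character.MAX_CODE_POINT) {
                return Optional.of(prefix.appendCodePoint(next).toString());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(truncateMin("iceberg-table", 4));
        System.out.println(truncateMax("iceberg-table", 4).orElse("(no bound)"));
    }
}
```

With bounds like these a filter can still skip row groups safely, but the stored
values are no longer the real min/max, which is exactly the concern raised in
the comment above.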
--
This message was sent by Atlassian Jira
(v8.3.4#803005)