[ https://issues.apache.org/jira/browse/PARQUET-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195435#comment-16195435 ]
Ryan Blue commented on PARQUET-686:
-----------------------------------

Merged PR #55 with some follow-ups. Thanks for the fix, [~zi]!

> Allow for Unsigned Statistics in Binary Type
> --------------------------------------------
>
>                 Key: PARQUET-686
>                 URL: https://issues.apache.org/jira/browse/PARQUET-686
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-format
>            Reporter: Andrew Duffy
>            Assignee: Ryan Blue
>             Fix For: 1.9.0
>
>
> BinaryStatistics currently only have a min/max, which are compared as signed
> {{byte[]}}. However, for real UTF8-friendly lexicographic comparison, e.g.
> for string columns, we would want to calculate the BinaryStatistics based off
> of a comparator that treats the bytes as unsigned.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
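
For readers unfamiliar with the issue, the difference between the two orderings described above can be sketched in a few lines of Java. This is only an illustration, not the code merged in the PR: the class name `UnsignedBinaryCompare` and both helper methods are hypothetical and do not come from Parquet. The sketch assumes plain lexicographic comparison of UTF-8 encoded bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; these names are not part of Parquet.
public class UnsignedBinaryCompare {

    // Lexicographic comparison that treats each byte as signed (-128..127).
    static int compareSigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Byte.compare(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return Integer.compare(a.length, b.length);
    }

    // Lexicographic comparison that treats each byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            // Masking with 0xFF widens the byte to an int without sign extension.
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] z = "z".getBytes(StandardCharsets.UTF_8);           // 0x7A
        byte[] eAcute = "\u00e9".getBytes(StandardCharsets.UTF_8); // 0xC3 0xA9

        // Signed: 0xC3 is read as -61, so "z" sorts after the accented letter.
        System.out.println(compareSigned(z, eAcute) > 0);
        // Unsigned: 0xC3 is read as 195, so "z" sorts before it.
        System.out.println(compareUnsigned(z, eAcute) < 0);
    }
}
```

Any UTF-8 string containing a non-ASCII character has bytes at or above 0x80, which read as negative under the signed interpretation. A min/max computed with the signed comparator can therefore disagree with the order a string comparison expects.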