Zoltan Ivanfi created PARQUET-1065:
--------------------------------------
Summary: Deprecate type-defined sort ordering for INT96 type
Key: PARQUET-1065
URL: https://issues.apache.org/jira/browse/PARQUET-1065
Project: Parquet
Issue Type: Bug
Reporter: Zoltan Ivanfi
Assignee: Zoltan Ivanfi
[parquet.thrift in
parquet-format|https://github.com/apache/parquet-format/blob/041708da1af52e7cb9288c331b542aa25b68a2b6/src/main/thrift/parquet.thrift#L37]
defines the sort order for INT96 to be signed.
[ParquetMetadataConverter.java in
parquet-mr|https://github.com/apache/parquet-mr/blob/352b906996f392030bfd53b93e3cf4adb78d1a55/parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java#L422]
uses unsigned ordering instead. In practice, INT96 is only used for timestamps,
and neither signed nor unsigned ordering of the numeric values is correct for
this purpose. For this reason, the INT96 sort order should be specified as
undefined.
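
To illustrate the mismatch, here is a minimal sketch and not code from
parquet-mr. It assumes the commonly used INT96 timestamp layout (8 bytes of
little-endian nanoseconds within the day, followed by 4 bytes of little-endian
Julian day) and interprets "unsigned" and "signed" ordering as byte-wise
lexicographic comparison of the 12 raw bytes. The helper names are hypothetical.

```python
import struct


def encode_int96(julian_day: int, nanos_of_day: int) -> bytes:
    # Assumed layout: 8-byte LE nanoseconds of day, then 4-byte LE Julian day.
    return struct.pack("<qi", nanos_of_day, julian_day)


def chronological_key(raw: bytes):
    # Decode back to (day, nanos) so tuples compare in real time order.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return (julian_day, nanos_of_day)


def unsigned_key(raw: bytes):
    # Byte-wise comparison with each byte read as 0..255.
    return tuple(raw)


def signed_key(raw: bytes):
    # Byte-wise comparison with each byte read as -128..127.
    return tuple(b - 256 if b > 127 else b for b in raw)


DAY = 2457900  # arbitrary Julian day, identical for all three values
early = encode_int96(DAY, 1)    # first byte 0x01
middle = encode_int96(DAY, 128)  # first byte 0x80
late = encode_int96(DAY, 256)   # first bytes 0x00 0x01

values = [late, early, middle]
chronological = sorted(values, key=chronological_key)
by_unsigned = sorted(values, key=unsigned_key)
by_signed = sorted(values, key=signed_key)

print(chronological == by_unsigned, chronological == by_signed)
```

Under these assumptions, both byte-wise orderings disagree with the
chronological one, so min/max statistics computed with either ordering could
not be trusted for filtering timestamps.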
(As a special case, statistics with min == max signify that all values are the
same, so they can be considered valid even when the ordering is undefined.)
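
The special case could be checked along these lines. This is a sketch only:
`int96_stats_are_usable` is a hypothetical helper, not an existing parquet-mr
or parquet-format function, and it assumes the statistics are available as raw
byte strings (or None when absent).

```python
from typing import Optional


def int96_stats_are_usable(min_value: Optional[bytes],
                           max_value: Optional[bytes]) -> bool:
    # With an undefined sort order, min and max are only meaningful when they
    # are identical, because then every value in the column chunk equals them
    # regardless of how the writer compared values.
    return min_value is not None and min_value == max_value


same = bytes(12)
other = bytes([1]) + bytes(11)
print(int96_stats_are_usable(same, same))   # equal min and max
print(int96_stats_are_usable(same, other))  # differing min and max
```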