This is an automated email from the ASF dual-hosted git repository.
gabor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git
The following commit(s) were added to refs/heads/master by this push:
new 2b38663 PARQUET-1487: Do not write original type for timezone-agnostic timestamps (#125)
2b38663 is described below
commit 2b38663a28ccd4156319c0bf7ae4e6280e0c6e2d
Author: Zoltan Ivanfi <[email protected]>
AuthorDate: Wed Jan 9 13:35:34 2019 +0100
PARQUET-1487: Do not write original type for timezone-agnostic timestamps (#125)
Clarify in the comments that we should only map the new TIMESTAMP type
to the old TIMESTAMP_MILLIS or TIMESTAMP_MICROS types when the semantics
match (UTC normalized and the precision matches).
---
LogicalTypes.md | 60 +++++++++++++++++++++++++++++++++++-------
src/main/thrift/parquet.thrift | 11 ++++++--
2 files changed, 59 insertions(+), 12 deletions(-)
diff --git a/LogicalTypes.md b/LogicalTypes.md
index 4c103e2..be8734a 100644
--- a/LogicalTypes.md
+++ b/LogicalTypes.md
@@ -274,11 +274,13 @@ The sort order used for `TIME` is signed.
#### Deprecated time ConvertedType
-`TIME_MILLIS` is the deprecated ConvertedType counterpart of `TIME` logical type
-with precision `MILLIS`. Like the logical type counterpart, it must annotate an `int32`
+`TIME_MILLIS` is the deprecated ConvertedType counterpart of a `TIME` logical
+type that is UTC normalized and has `MILLIS` precision. Like the logical type
+counterpart, it must annotate an `int32`.
-`TIME_MICROS` is the deprecated ConvertedType counterpart of `TIME` logical type
-with precision `MICROS`. Like the logical type counterpart, it must annotate an `int64`
+`TIME_MICROS` is the deprecated ConvertedType counterpart of a `TIME` logical
+type that is UTC normalized and has `MICROS` precision. Like the logical type
+counterpart, it must annotate an `int64`.
*Backward compatibility:*
@@ -295,7 +297,8 @@ with precision `MICROS`. Like the logical type counterpart, it must annotate an
<th>ConvertedType</th>
</tr>
<tr>
- <td rowspan="2" colspan="2">TimeType</td>
+ <td rowspan="6">TimeType</td>
+ <td rowspan="3">isAdjustedToUTC = true</td>
<td>unit = MILLIS</td>
<td>TIME_MILLIS</td>
</tr>
@@ -303,6 +306,23 @@ with precision `MICROS`. Like the logical type counterpart, it must annotate an
<td>unit = MICROS</td>
<td>TIME_MICROS</td>
</tr>
+ <tr>
+ <td>unit = NANOS</td>
+ <td>-</td>
+ </tr>
+ <tr>
+ <td rowspan="3">isAdjustedToUTC = false</td>
+ <td>unit = MILLIS</td>
+ <td>-</td>
+ </tr>
+ <tr>
+ <td>unit = MICROS</td>
+ <td>-</td>
+ </tr>
+ <tr>
+ <td>unit = NANOS</td>
+ <td>-</td>
+ </tr>
</table>
### TIMESTAMP
@@ -329,11 +349,13 @@ The sort order used for `TIMESTAMP` is signed.
#### Deprecated timestamp ConvertedType
-`TIMESTAMP_MILLIS` is the deprecated ConvertedType counterpart of `TIMESTAMP` logical type
-with precision `MILLIS`. Like the logical type counterpart, it must annotate an `int64`
+`TIMESTAMP_MILLIS` is the deprecated ConvertedType counterpart of a `TIMESTAMP`
+logical type that is UTC normalized and has `MILLIS` precision. Like the logical
+type counterpart, it must annotate an `int64`.
-`TIMESTAMP_MICROS` is the deprecated ConvertedType counterpart of `TIMESTAMP` logical type
-with precision `MICROS`. Like the logical type counterpart, it must annotate an `int64`
+`TIMESTAMP_MICROS` is the deprecated ConvertedType counterpart of a `TIMESTAMP`
+logical type that is UTC normalized and has `MICROS` precision. Like the logical
+type counterpart, it must annotate an `int64`.
*Backward compatibility:*
@@ -350,7 +372,8 @@ with precision `MICROS`. Like the logical type counterpart, it must annotate an
<th>ConvertedType</th>
</tr>
<tr>
- <td rowspan="2" colspan="2">TimestampType</td>
+ <td rowspan="6">TimestampType</td>
+ <td rowspan="3">isAdjustedToUTC = true</td>
<td>unit = MILLIS</td>
<td>TIMESTAMP_MILLIS</td>
</tr>
@@ -358,6 +381,23 @@ with precision `MICROS`. Like the logical type counterpart, it must annotate an
<td>unit = MICROS</td>
<td>TIMESTAMP_MICROS</td>
</tr>
+ <tr>
+ <td>unit = NANOS</td>
+ <td>-</td>
+ </tr>
+ <tr>
+ <td rowspan="3">isAdjustedToUTC = false</td>
+ <td>unit = MILLIS</td>
+ <td>-</td>
+ </tr>
+ <tr>
+ <td>unit = MICROS</td>
+ <td>-</td>
+ </tr>
+ <tr>
+ <td>unit = NANOS</td>
+ <td>-</td>
+ </tr>
</table>
### INTERVAL
diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index c195177..7a29b80 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -326,8 +326,15 @@ union LogicalType {
4: EnumType ENUM // use ConvertedType ENUM
5: DecimalType DECIMAL // use ConvertedType DECIMAL
6: DateType DATE // use ConvertedType DATE
- 7: TimeType TIME // use ConvertedType TIME_MICROS or TIME_MILLIS
- 8: TimestampType TIMESTAMP // use ConvertedType TIMESTAMP_MICROS or TIMESTAMP_MILLIS
+
+ // use ConvertedType TIME_MICROS for TIME(isAdjustedToUTC = true, unit = MICROS)
+ // use ConvertedType TIME_MILLIS for TIME(isAdjustedToUTC = true, unit = MILLIS)
+ 7: TimeType TIME
+
+ // use ConvertedType TIMESTAMP_MICROS for TIMESTAMP(isAdjustedToUTC = true, unit = MICROS)
+ // use ConvertedType TIMESTAMP_MILLIS for TIMESTAMP(isAdjustedToUTC = true, unit = MILLIS)
+ 8: TimestampType TIMESTAMP
+
// 9: reserved for INTERVAL
10: IntType INTEGER // use ConvertedType INT_* or UINT_*
11: NullType UNKNOWN // no compatible ConvertedType
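
The backward-compatibility rule this commit documents can be sketched as a small lookup. This is a minimal illustration only: the function name `converted_type_for` and the string encoding of types and units are hypothetical and are not part of parquet-format or any Parquet library. It assumes a writer should emit a deprecated ConvertedType only when the logical type is UTC adjusted and the unit has a matching legacy counterpart, as the updated tables state.

```python
from typing import Optional

# Hypothetical table mirroring the "isAdjustedToUTC = true" rows of the
# updated LogicalTypes.md tables. NANOS has no ConvertedType counterpart.
_UTC_ADJUSTED_COUNTERPARTS = {
    ("TIME", "MILLIS"): "TIME_MILLIS",
    ("TIME", "MICROS"): "TIME_MICROS",
    ("TIMESTAMP", "MILLIS"): "TIMESTAMP_MILLIS",
    ("TIMESTAMP", "MICROS"): "TIMESTAMP_MICROS",
}


def converted_type_for(logical_type: str,
                       is_adjusted_to_utc: bool,
                       unit: str) -> Optional[str]:
    """Return the deprecated ConvertedType name to write, or None for "-".

    Hypothetical helper: names and string values are assumptions made
    for illustration, not an API of any Parquet implementation.
    """
    if not is_adjusted_to_utc:
        # Timezone-agnostic values have no matching legacy semantics,
        # so no ConvertedType should be written.
        return None
    return _UTC_ADJUSTED_COUNTERPARTS.get((logical_type, unit))


if __name__ == "__main__":
    print(converted_type_for("TIMESTAMP", True, "MICROS"))
    print(converted_type_for("TIMESTAMP", False, "MICROS"))
    print(converted_type_for("TIME", True, "NANOS"))
```

Returning `None` for every `isAdjustedToUTC = false` case corresponds to the "-" cells in the tables above.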