KurtYoung commented on a change in pull request #10212:
[FLINK-14806][table-planner-blink] Add setTimestamp/getTimestamp inte…
URL: https://github.com/apache/flink/pull/10212#discussion_r347710222
##########
File path:
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/dataformat/vector/VectorizedColumnBatch.java
##########
@@ -132,4 +135,43 @@ public Decimal getDecimal(int rowId, int colId, int precision, int scale) {
 		return Decimal.fromUnscaledBytes(precision, scale, bytes);
 	}
 }
+
+	public SqlTimestamp getTimestamp(int rowId, int colId, int precision) {
+		if (isNullAt(rowId, colId)) {
+			return null;
+		}
+
+		// The precision of Timestamp in parquet should be one of MILLIS, MICROS or NANOS.
+		// https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp
+		//
+		// For MILLIS, the underlying INT64 holds milliseconds
+		// For MICROS, the underlying INT64 holds microseconds
+		// For NANOS, the underlying INT96 holds nanoOfDay(8 bytes) and julianDay(4 bytes)
+		if (columns[colId] instanceof TimestampColumnVector) {
Review comment:
I don't quite understand what's going on here. It looks like we have our own
`TimestampColumnVector` and use it to handle certain cases, but we also have
other shortcuts for SqlTimestamp. Did we set up this rule ourselves, or are we
following Parquet's rule, like using a long vector for timestamps with
precision less than 3? If we follow Parquet's protocol, what will happen when
we read from ORC?
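
For reference, a minimal sketch (not part of the PR) of what the NANOS
INT96 layout mentioned in the diff implies: the pair (julianDay, nanoOfDay)
can be mapped to epoch milliseconds using the standard Julian day number of
1970-01-01. The class and method names here are hypothetical, purely for
illustration:

```java
// Hypothetical helper illustrating the INT96 layout quoted above:
// 8 bytes nanoOfDay + 4 bytes julianDay.
public class Int96TimestampSketch {

	// Julian day number of the Unix epoch (1970-01-01).
	private static final long JULIAN_EPOCH_DAY = 2_440_588L;
	private static final long MILLIS_PER_DAY = 86_400_000L;
	private static final long NANOS_PER_MILLI = 1_000_000L;

	/** Converts the INT96 pair to milliseconds since the Unix epoch. */
	public static long toEpochMillis(int julianDay, long nanoOfDay) {
		return (julianDay - JULIAN_EPOCH_DAY) * MILLIS_PER_DAY
				+ nanoOfDay / NANOS_PER_MILLI;
	}

	public static void main(String[] args) {
		// Midnight of 1970-01-01 maps to epoch millis 0.
		System.out.println(toEpochMillis(2_440_588, 0L));
	}
}
```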
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services