HuangZhenQiu commented on PR #23511:
URL: https://github.com/apache/flink/pull/23511#issuecomment-1763633001

   > Thank you @HuangZhenQiu a lot for contributing this.
   > 
   > After reading the [Avro 
spec](https://avro.apache.org/docs/1.11.0/spec.html), I think we have wrongly 
mapped the Avro timestamp.
   > 
   > Avro spec says:
   > 
   > > Timestamp (millisecond precision)
   > > The timestamp-millis logical type represents an instant on the global 
timeline, independent of a particular time zone or calendar, with a precision 
of one millisecond. Please note that time zone information gets lost in this 
process. Upon reading a value back, we can only reconstruct the instant, but 
not the original representation. In practice, such timestamps are typically 
displayed to users in their local time zones, therefore they may be displayed 
differently depending on the execution environment.
   > > A timestamp-millis logical type annotates an Avro long, where the long 
stores the number of milliseconds from the unix epoch, 1 January 1970 
00:00:00.000 UTC.
   > 
   > [Consistent timestamp types in Hadoop SQL 
engines](https://docs.google.com/document/d/1gNRww9mZJcHvUDCXklzjFEQGpefsuR_akCDfWsdE35Q/edit)
 also pointed out:
   > 
   > > Timestamps in Avro, Parquet and RCFiles with a binary SerDe have Instant 
semantics
   > 
   > So Avro Timestamp has Java Instant semantics and should map to Flink TIMESTAMP_LTZ, but it currently maps to TIMESTAMP_NTZ.
   > 
   > On the contrary,
   > 
   > > Local timestamp (millisecond precision)
   > > The local-timestamp-millis logical type represents a timestamp in a 
local timezone, regardless of what specific time zone is considered local, with 
a precision of one millisecond.
   > > A local-timestamp-millis logical type annotates an Avro long, where the 
long stores the number of milliseconds, from 1 January 1970 00:00:00.000.
   > 
   > Avro LocalTimestamp has Java LocalDateTime semantics and should map to Flink TIMESTAMP_NTZ.
   > 
   > If we agree with this behavior, we may need to open a discussion in the 
dev ML about how to correct the behavior in a backward-compatible or 
incompatible way.
   
   @wuchong Thanks for the feedback and the pointer to the Hadoop alignment doc. Besides this, I am also unclear about how to convert timestamp data to TimestampData, the internal RowData representation. A Flink user can define a dynamic table in Avro format with a timestamp field whose target type is a timestamp with time zone, but we can't convert the Avro long-typed data to that target type, as the target Flink type is missing in the converters. I would like to open a discussion on the dev ML after our offline sync.
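   To make the semantic gap concrete without touching any Flink types, here is a minimal java.time sketch. The epoch value and the zone IDs are arbitrary examples; the point is that the same Avro long must be decoded as an Instant for timestamp-millis (TIMESTAMP_LTZ behavior) but as a LocalDateTime for local-timestamp-millis (TIMESTAMP_NTZ behavior):

   ```java
   import java.time.Instant;
   import java.time.LocalDateTime;
   import java.time.ZoneId;
   import java.time.ZoneOffset;

   public class AvroTimestampSemantics {
       public static void main(String[] args) {
           // A hypothetical Avro long value (millis since the unix epoch).
           long epochMillis = 1_696_946_400_000L;

           // timestamp-millis: an instant on the global timeline
           // (Instant semantics, i.e. Flink TIMESTAMP_LTZ).
           Instant instant = Instant.ofEpochMilli(epochMillis);

           // local-timestamp-millis: a wall-clock time with no zone attached
           // (LocalDateTime semantics, i.e. Flink TIMESTAMP_NTZ). The same long
           // is read as millis since 1970-01-01T00:00:00 on a zoneless timeline.
           LocalDateTime local = LocalDateTime.ofEpochSecond(
                   Math.floorDiv(epochMillis, 1000),
                   (int) Math.floorMod(epochMillis, 1000) * 1_000_000,
                   ZoneOffset.UTC);

           // The instant renders differently depending on the session zone ...
           System.out.println(instant.atZone(ZoneId.of("UTC")).toLocalDateTime());
           System.out.println(instant.atZone(ZoneId.of("Asia/Shanghai")).toLocalDateTime());
           // ... while the local date-time never changes, whatever the zone is.
           System.out.println(local);
       }
   }
   ```

   The two printed zoned renderings differ by the zone offset, while the LocalDateTime stays fixed, which is exactly why mapping both logical types to the same Flink type loses information.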
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
