Fokko commented on PR #2554: URL: https://github.com/apache/avro/pull/2554#issuecomment-1775096258
> The problem is that in the Avro data format, there is no timezone data in both types. It is left to the application to provide it, e.g. by storing it in a sibling field.

This is correct. The Hive doc also mentions [TIMESTAMP WITH TIME ZONE](https://cwiki.apache.org/confluence/display/Hive/Different+TIMESTAMP+types), which captures this behavior.

> The best would be the timestamp-xyz types to encode both the long and the timezone in one field. Then the Avro SDKs could provide help with deserializing it to language-specific classes, e.g. OffsetDateTime in Java.

I don't agree about storing the timezone. I don't think that is something Avro should do, because it is a huge can of worms. In general, to maintain engineers' sanity, it is best to normalize everything to UTC when storing the data. Having a way to store the timezone as well would require us to:

- Apply the timezone first before being able to do any comparison on it.
- Handle daylight saving time? (Yes, we're in that time of year again.)
- Handle historical changes to the timestamps, as @KalleOlaviNiemitalo already pointed out.

If we want this, it should be a separate proposal that introduces new types, because we can't alter the existing ones, as you already mentioned.

> My point is that we are going to add more unnecessary pair to the current spec :-/

Fair question. It would allow people who use the existing `local` timestamps to migrate to nano precision.
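
To illustrate the UTC-normalization argument, here is a minimal Python sketch. It is not Avro SDK code: `to_utc_micros` and `from_utc_micros` are hypothetical helpers, and the fixed offsets are assumptions standing in for real timezone rules (which would also have to cover daylight saving time and historical rule changes). It only shows that two different wall-clock values can denote the same instant, and that a stored UTC long can be compared directly without applying any timezone first.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_utc_micros(dt: datetime) -> int:
    """Normalize an offset-aware datetime to UTC microseconds since the epoch."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: the offset must be known before normalizing")
    # Subtracting two aware datetimes accounts for their UTC offsets.
    return (dt - EPOCH) // timedelta(microseconds=1)


def from_utc_micros(micros: int) -> datetime:
    """Rebuild a UTC datetime from a stored long."""
    return EPOCH + timedelta(microseconds=micros)


# Fixed offsets used as stand-ins (assumption): +02:00 and -04:00.
plus_two = timezone(timedelta(hours=2))
minus_four = timezone(timedelta(hours=-4))

a = datetime(2023, 10, 23, 18, 0, tzinfo=plus_two)
b = datetime(2023, 10, 23, 12, 0, tzinfo=minus_four)

a_micros = to_utc_micros(a)
b_micros = to_utc_micros(b)

# The wall-clock fields differ, but the normalized longs are directly comparable.
same_instant = a_micros == b_micros
wall_clocks_differ = a.replace(tzinfo=None) != b.replace(tzinfo=None)
```

With the values above, both datetimes fall on 16:00 UTC, so the two longs are equal even though the local clock readings are six hours apart. If the original zone matters to the application, it can still be kept in a sibling field next to the long.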
