Fokko commented on PR #2554:
URL: https://github.com/apache/avro/pull/2554#issuecomment-1775096258

   > The problem is that in the Avro data format, there is no timezone data in 
both types. It is left to the application to provide it, e.g. by storing it in 
a sibling field.
   
   This is correct. The Hive docs also mention the 
[TIMESTAMP WITH TIME ZONE](https://cwiki.apache.org/confluence/display/Hive/Different+TIMESTAMP+types)
 type, which captures this behavior. 
   
   > The best would be the timestamp-xyz types to encode both the long and the 
timezone in one field. Then the Avro SDKs could provide help with deserializing 
it to language-specific classes, e.g. OffsetDateTime in Java.
   
   I don't agree with storing the timezone. I don't think Avro should do 
that, because it is a huge can of worms. In general, to maintain engineers' 
sanity, it is best to normalize everything to UTC when storing the data. 
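   
   A minimal sketch of what "normalize to UTC when storing" could look like 
on the application side, assuming the value ends up in a long that counts 
milliseconds since the Unix epoch. The helper name `to_epoch_millis` and the 
sample values are made up for illustration; this is not an Avro SDK API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt: datetime) -> int:
    """Normalize an offset-aware datetime to UTC and return epoch milliseconds."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        # A naive datetime carries no offset, so there is nothing to normalize from.
        raise ValueError("refusing to guess the zone of a naive datetime")
    return (dt.astimezone(timezone.utc) - EPOCH) // timedelta(milliseconds=1)


# Two wall-clock readings of the same instant (12:00 UTC), taken at different offsets.
plus_two = datetime(2023, 10, 23, 14, 0, tzinfo=timezone(timedelta(hours=2)))
minus_four = datetime(2023, 10, 23, 8, 0, tzinfo=timezone(timedelta(hours=-4)))

# After normalization both store as the same long and compare directly.
assert to_epoch_millis(plus_two) == to_epoch_millis(minus_four)
```

   Once everything is stored this way, ordering and equality work on the raw 
long without consulting any zone information.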
   
   Storing the timezone as well would require us to:
   - Apply the timezone first before doing any comparison on the value
   - Handle daylight saving time (yes, we're in that time of year again)
   - Handle historical changes to the timestamps, as @KalleOlaviNiemitalo already 
pointed out.
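   
   To illustrate the first bullet, here is a rough sketch assuming a 
hypothetical record that keeps a local wall-clock long next to an offset in 
minutes. `LocalStamp` and `instant_millis` are invented names, and the values 
are milliseconds since local midnight to keep the arithmetic readable.

```python
from typing import NamedTuple

MILLIS_PER_MINUTE = 60_000
MILLIS_PER_HOUR = 3_600_000


class LocalStamp(NamedTuple):
    """Hypothetical pair: a local wall-clock long plus its UTC offset."""
    local_millis: int
    offset_minutes: int


def instant_millis(stamp: LocalStamp) -> int:
    """Apply the offset to recover the UTC-based value."""
    return stamp.local_millis - stamp.offset_minutes * MILLIS_PER_MINUTE


a = LocalStamp(14 * MILLIS_PER_HOUR, 120)   # 14:00 at UTC+02:00, i.e. 12:00 UTC
b = LocalStamp(9 * MILLIS_PER_HOUR, -240)   # 09:00 at UTC-04:00, i.e. 13:00 UTC

# Comparing the raw longs says a is later than b ...
assert a.local_millis > b.local_millis
# ... but once the offset is applied, a actually happened first.
assert instant_millis(a) < instant_millis(b)
```

   Fixed offsets sidestep the other two bullets. With named zones, the offset 
itself depends on the date and on the version of the time zone rules in use, 
which is where the daylight saving and historical-change problems come in.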
   
   If we want this, it should be a separate proposal that introduces new 
types, because we can't alter the existing ones, as you already mentioned.
   
   > My point is that we are going to add more unnecessary pair to the current 
spec :-/
   
   Fair question. It would allow people who use the existing `local` 
timestamps to migrate to nano precision. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
