adamhooper commented on a change in pull request #10629:
URL: https://github.com/apache/arrow/pull/10629#discussion_r661606855



##########
File path: format/Schema.fbs
##########
@@ -218,8 +218,34 @@ table Time {
 /// leap seconds, as a 64-bit integer. Note that UNIX time does not include
 /// leap seconds.
 ///
-/// The Timestamp metadata supports both "time zone naive" and "time zone
-/// aware" timestamps. Read about the timezone attribute for more detail
+/// Date & time libraries often have multiple different data types for temporal
+/// data.  In order to ease interoperability between different implementations 
the
+/// Arrow project has some recommendations for encoding these types into a 
Timestamp
+/// column.
+///
+/// An "Instant" represents a single moment in time that has no meaningful 
time zone
+/// or the time zone is unknown.  A column of Instants can also contain values 
from
+/// multiple time zones.  To encode an instant set the timezone string to 
"UTC".
+///
+/// A "ZonedDateTime" represents a single moment in time that has a meaningful
+/// reference time zone.  To encode a ZonedDateTime as a Timestamp set the 
timezone
+/// string to the name of the timezone.
+///
+/// An "OffsetDateTime" represents a single moment in time combined with a 
meaningful
+/// offset from UTC.  To encode an OffsetDateTime as a Timestamp set the 
timezone string
+/// to the numeric time zone offset string (e.g. "+03:00").
+///
+/// A "LocalDateTime" does not represent a single moment in time.  It 
represents a wall
+/// clock time combined with a date.  Because of daylight savings time there 
may multiple
+/// Instants that correspond to a single LocalDateTime in any given time zone. 
 A
+/// LocalDateTime is often stored as a struct or a Date32/Time64 pair.  
However, it can
+/// also be encoded into a Timestamp column.  To do so the value should be the 
the time
+/// elapsed from the Unix epoch so that a wall clock in UTC would display the 
desired time.
+/// Futhermore, the timezone string should be set to null.
+///
+/// In ALL cases the values of the Timestamp column should be the time elapsed 
from the

Review comment:
       It sounds like this sentence is trying to clarify, but I think 
_removing_ it might clarify more.
   
   Looking at two subsequent paragraphs in the spec:
   
   * "LocalDateTime": "... value should be the the time elapsed from the Unix 
epoch so that a wall clock in UTC would display the desired time."
   * "ALL": "value should be the time elapsed from the Unix epoch".
   
   The "ALL" sentence is grammatically correct -- yes, both descriptions 
include the literal phrase, "time elapsed from the Unix epoch." But as a spec 
reader, when I read a spec that has two strikingly similar sentences so close 
together, I ask myself: "is this clarification saying the same thing, or is it 
saying the opposite thing?" It's easy to interpret this "ALL" sentence as 
meaning the opposite thing ... until I go back and re-read three times and feel 
slightly annoyed.
   
   I was going to suggest moving this sentence to the top of the definition, 
because that would be simpler to understand. But ... it's already there.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to