[GitHub] [arrow-rs] jorgecarleitao commented on pull request #1937: Set adjusted to UTC if UTC timezone (#1932)

GitBox Mon, 27 Jun 2022 14:44:03 -0700


jorgecarleitao commented on PR #1937:
URL: https://github.com/apache/arrow-rs/pull/1937#issuecomment-1167941057


   While applying this fix in arrow2, I broke some of our integration tests 
against pyarrow. Going back to the spec, when the tz string is set in Arrow, it 
means that
   
   1. the `i64` values are timezone-aware
   2. the values are stored in UTC
   3. the tz string represents the offset/timezone that needs to be taken into 
account when using the `i64`
   
   In this PR's notation, the values are by definition normalized to UTC; it 
is up to the user of the logical type to de-normalize them if they need to 
apply offsets (e.g. to display them).
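
   To make the de-normalization step concrete, here is a minimal sketch using only the standard library. The helper names (`parse_offset_seconds`, `local_wall_clock_seconds`) are hypothetical and not part of arrow-rs or arrow2, and only fixed offsets such as `+02:00` are handled; named zones would need a timezone database.

```rust
/// Hypothetical helper: parse a fixed offset such as "+02:00" into seconds.
/// Returns None for anything else (e.g. named zones like "Europe/Lisbon").
fn parse_offset_seconds(tz: &str) -> Option<i64> {
    let (sign, rest) = match tz.as_bytes().first()? {
        b'+' => (1, &tz[1..]),
        b'-' => (-1, &tz[1..]),
        _ => return None,
    };
    let (hours, minutes) = rest.split_once(':')?;
    let hours: i64 = hours.parse().ok()?;
    let minutes: i64 = minutes.parse().ok()?;
    Some(sign * (hours * 3600 + minutes * 60))
}

/// The stored value is seconds since the Unix epoch in UTC and is never
/// modified; the offset is applied only when a local wall-clock reading
/// is needed, e.g. for display.
fn local_wall_clock_seconds(utc_seconds: i64, tz: &str) -> Option<i64> {
    Some(utc_seconds + parse_offset_seconds(tz)?)
}
```

   Two arrays with the same `i64` values but different tz strings therefore describe the same instants; only the rendering differs.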
   
   AFAIK this matches what "is adjusted to UTC" in parquet means:
   
   > A TIMESTAMP with isAdjustedToUTC=true is defined as the number of 
milliseconds, microseconds or nanoseconds (depending on the unit parameter 
being MILLIS, MICROS or NANOS, respectively) elapsed since the Unix epoch, 
1970-01-01 00:00:00 UTC. Each such value unambiguously identifies a single 
instant on the time-line.
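
   Under this reading, the mapping from the Arrow tz string to the parquet flag could be sketched as below. `is_adjusted_to_utc` is a hypothetical function that illustrates the interpretation argued in this comment; it is not the arrow-rs or arrow2 implementation.

```rust
/// Hypothetical rule: any Arrow timestamp that carries a tz string is
/// stored in UTC, so it maps to isAdjustedToUTC = true regardless of
/// which timezone the string names. Only timestamps without a tz string
/// map to isAdjustedToUTC = false.
fn is_adjusted_to_utc(arrow_tz: Option<&str>) -> bool {
    arrow_tz.is_some()
}
```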
   
   In other words, I _think_ that the previous behavior was correct (but I may 
be wrong).


