wgtmac commented on code in PR #1867:
URL: https://github.com/apache/orc/pull/1867#discussion_r1562011527


##########
site/specification/ORCv1.md:
##########
@@ -1170,6 +1173,35 @@ DIRECT_V2     | PRESENT         | Yes      | Boolean RLE
               | DATA            | No       | Signed Integer RLE v2
               | SECONDARY       | No       | Unsigned Integer RLE v2
 
+Due to ORC-763, values before the UNIX epoch which have nanoseconds greater
+than 999,999 are adjusted to have 1 second less.
+
+For example, given a stripe with a TIMESTAMP column with a writer timezone
+of US/Pacific, and a reader timezone of UTC, we have the decoded integer values
+of -1,440,851,103 from the DATA stream and 199,900,000 from the SECONDARY stream.
+
+First we must adjust the DATA value to be relative to the UNIX epoch. The ORC
+epoch is 1 January 2015 00:00:00 US/Pacific, since we must take into account the writer
+timezone. This translates to 1 January 2015 08:00:00 UTC, as US/Pacific is equivalent
+to a -08:00 offset from UTC at that date (no daylight savings). The number of seconds
+from 1 January 1970 00:00:00 UTC to 1 January 2015 08:00:00 UTC is 1,420,099,200. This is
+added to the DATA value to produce a value of -20,751,903. As this is before the
+UNIX epoch (since it is negative), and the SECONDARY value, 199,900,000, is
+greater than 999,999, then this DATA value is adjusted to become -20,751,904
+(1 second subtracted).
+
+This value by itself represents 5 May 1969 19:34:56.1999, which now needs to be adjusted
+from US/Pacific (the writer's timezone) to UTC (the reader's timezone). As the value is
+within daylight savings for US/Pacific, 7 hours are subtracted to give the final value
+of 5 May 1969 12:34:56.1999.

Review Comment:
   Similarly, I'm not inclined to add these details to the spec unless we are 100% sure about them.
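
One way to build that confidence is to check the arithmetic of the proposed example mechanically. The following is a minimal sketch in Python, not taken from any ORC reader. It assumes fixed UTC offsets in place of the US/Pacific zone rules (-08:00 on 1 January 2015 and -07:00 in May 1969, as the quoted text states), and all names in it are hypothetical. It only confirms that the numbers in the example are internally consistent; it says nothing about what the Java or C++ readers actually do.

```python
from datetime import datetime, timedelta, timezone

# Decoded values quoted in the proposed spec text.
DATA = -1_440_851_103          # seconds relative to the ORC epoch (DATA stream)
SECONDARY_NANOS = 199_900_000  # nanoseconds (SECONDARY stream)

# Assumption: fixed offsets stand in for US/Pacific, per the quoted text.
PST = timezone(timedelta(hours=-8))   # offset in effect on 1 January 2015
PDT_OFFSET = timedelta(hours=-7)      # daylight-saving offset in May 1969

# ORC epoch (1 January 2015 00:00:00 in the writer timezone),
# expressed as seconds since the UNIX epoch.
orc_epoch = int(datetime(2015, 1, 1, tzinfo=PST).timestamp())

# Make the DATA value relative to the UNIX epoch.
seconds = DATA + orc_epoch

# ORC-763 adjustment as described in the quoted text: negative seconds
# combined with nanoseconds above 999,999 lose one second.
if seconds < 0 and SECONDARY_NANOS > 999_999:
    seconds -= 1

# Interpret the result as a wall-clock value counted from 1 January 1970.
unix_epoch = datetime(1970, 1, 1)
unadjusted = unix_epoch + timedelta(
    seconds=seconds, microseconds=SECONDARY_NANOS // 1000
)

# Shift from the writer timezone to the reader timezone (UTC), as in the text.
final = unadjusted + PDT_OFFSET

print(orc_epoch)   # 1420099200
print(seconds)     # -20751904
print(unadjusted)  # 1969-05-05 19:34:56.199900
print(final)       # 1969-05-05 12:34:56.199900
```

Under these assumptions the intermediate and final values match the ones in the proposed text, so any remaining doubt is about reader behavior rather than the arithmetic.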



