serramatutu commented on code in PR #48002:
URL: https://github.com/apache/arrow/pull/48002#discussion_r2481766888


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -483,6 +483,28 @@ binary values look like.
 
 .. _variant_primitive_type_mapping:
 
+Timestamp With Offset
+=============
+This type represents a timestamp column that stores potentially different 
timezone offsets per value. The timestamp is stored in UTC alongside the 
original timezone offset in minutes.
+
+* Extension name: ``arrow.timestamp_with_offset``.
+
+* The storage type of the extension is a ``Struct`` with 2 fields, in order:
+
+  * ``timestamp``: a non-nullable ``Timestamp(time_unit, "UTC")``, where 
``time_unit`` is any Arrow ``TimeUnit`` (s, ms, us or ns).
+
+  * ``offset_minutes``: a non-nullable signed 16-bit integer (``Int16``) 
representing the offset in minutes from the UTC timezone. Negative offsets 
represent time zones west of UTC, while positive offsets represent east. 
Offsets range from -779 (-12:59) to +780 (+13:00).

Review Comment:
   @rok we just sent this to the mailing list yesterday. The [discussion 
thread](https://lists.apache.org/thread/yhbr3rj9l59yoxv92o2s6dqlop16sfnk) has a 
more extensive argumentation around why we chose these constraints.
   
   > Out of curiosity - would the proposed memory layout of match any existing 
system?
   
   The systems we're referencing are Snowflake, MS SQL Server, Oracle DB and 
Trino, of which only one of them (Trino) is open source. It's hard to know for 
a fact what is the internal memory layout of proprietary systems... We do know 
Oracle and Trino store IANA timezones instead of offsets, so the layout doesn't 
match there and some Arrow conversion layer would need to resolve the timezone 
names to offsets. 
   
   This (resolving offsets on the server) is an explicit choice so that 
consumer systems don't need to mess with the IANA database or reasoning about 
daylight savings etc. Arrow consumers just get the offset, add it to the 
timestamp and voila you have the original timestamp in the original timezone.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to