serramatutu commented on code in PR #48002: URL: https://github.com/apache/arrow/pull/48002#discussion_r2544923839
##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -483,6 +483,28 @@ binary values look like.
 
 .. _variant_primitive_type_mapping:
 
+Timestamp With Offset
+=============
+This type represents a timestamp column that stores potentially different timezone offsets per value. The timestamp is stored in UTC alongside the original timezone offset in minutes.
+
+* Extension name: ``arrow.timestamp_with_offset``.
+
+* The storage type of the extension is a ``Struct`` with 2 fields, in order:
+
+  * ``timestamp``: a non-nullable ``Timestamp(time_unit, "UTC")``, where ``time_unit`` is any Arrow ``TimeUnit`` (s, ms, us or ns).

Review Comment:
   Hey all! To move this forward, I:

   1. Changed the wording from `... non-nullable ...` to `... preferably non-nullable ...`
   2. Added a new note section outlining the expected semantics when the inner fields do have a validity buffer. This should be enough to avoid implementations drifting and having their own freestyle interpretation.
   3. Added a recommendation that implementations _should_ (but aren't required to) normalize the type by dropping the inner validity buffers and only keeping the top-level one. This should make it easier for compute kernels to interpret this type correctly without having to implement the logic themselves.

   Let me know if this is enough to resolve this thread!

   https://github.com/apache/arrow/pull/48002/commits/b0d9be3be93098e7932236b1e5aa921d3d1c8469

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
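
For readers following the thread, the normalization recommended in point 3 might look roughly like the sketch below. It is not taken from the PR. It models validity buffers as plain Python lists of booleans (with `None` standing for "no validity buffer") instead of using a real Arrow library, and it assumes that a row counts as null when the top-level struct slot or either inner field is null. The function name `normalize_validity` and the parameter names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# None models "this array has no validity buffer" (every slot is valid).
Validity = Optional[Sequence[bool]]


def normalize_validity(
    top: Validity, timestamp: Validity, offset: Validity, length: int
) -> Tuple[List[bool], None, None]:
    """Fold the inner validity buffers into the top-level one and drop them.

    Assumption (not confirmed by the text quoted in this thread): a row is
    null if the struct slot or either inner field is null.
    """

    def is_valid(buf: Validity, i: int) -> bool:
        return True if buf is None else bool(buf[i])

    merged = [
        is_valid(top, i) and is_valid(timestamp, i) and is_valid(offset, i)
        for i in range(length)
    ]
    # After normalization only the top-level validity buffer remains.
    return merged, None, None


top, ts, off = normalize_validity(
    top=[True, True, False, True],
    timestamp=[True, False, True, True],
    offset=None,
    length=4,
)
print(top)  # [True, False, False, True]
```

With a layout like this, a compute kernel would only need to consult the top-level validity to decide whether a value is null, which seems to be the motivation given for the recommendation.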
