felipecrv commented on code in PR #48002:
URL: https://github.com/apache/arrow/pull/48002#discussion_r2503778940


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -483,6 +483,28 @@ binary values look like.
 
 .. _variant_primitive_type_mapping:
 
+Timestamp With Offset
+=============
+This type represents a timestamp column that stores potentially different 
timezone offsets per value. The timestamp is stored in UTC alongside the 
original timezone offset in minutes.
+
+* Extension name: ``arrow.timestamp_with_offset``.
+
+* The storage type of the extension is a ``Struct`` with 2 fields, in order:
+
+  * ``timestamp``: a non-nullable ``Timestamp(time_unit, "UTC")``, where 
``time_unit`` is any Arrow ``TimeUnit`` (s, ms, us or ns).

Review Comment:
   @jorisvandenbossche now I understand what you meant.
   
   One complexity here is that some compute kernels might want to look at just 
the UTC timestamp field because they only care about the instant, so we should 
at least warn about, or recommend, what to do when two bitmaps exist. If the spec 
states that the top-level bitmap may be more selective than the inner bitmap 
(that is, it may mark additional slots as null), kernels looking at just the 
timestamp would be allowed to take the top-level bitmap and apply it to both 
the processing and the output.
   
   @lidavidm I think a top-level bitmap is the best option, but inevitably someone 
will have to decide what to do when more than one bitmap exists, and having 
recommendations in the spec could prevent divergence between implementations.
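   
   A minimal sketch of the two-bitmap question, in plain Python rather than any 
Arrow API. Validity bitmaps are modeled as lists of booleans, and all function 
names here are hypothetical. It assumes "more selective" means the top-level 
bitmap marks at least every slot that the inner bitmap marks as null; when that 
holds, a kernel reading only the timestamp child can reuse the top-level bitmap 
directly, and otherwise it has to intersect the two.

```python
from typing import List, Optional


def top_level_covers_child(top_level: List[bool], child: List[bool]) -> bool:
    """True if every slot valid at the struct level is also valid in the child."""
    return all(c for t, c in zip(top_level, child) if t)


def effective_validity(top_level: List[bool], child: List[bool]) -> List[bool]:
    """General case: a value is usable only if both validity bits are set."""
    return [t and c for t, c in zip(top_level, child)]


def instants(timestamps: List[int], validity: List[bool]) -> List[Optional[int]]:
    """Hypothetical kernel that reads only the UTC timestamp child and
    applies the given validity bitmap to its output."""
    return [ts if valid else None for ts, valid in zip(timestamps, validity)]


timestamps = [10, 20, 30, 40]           # UTC instants (arbitrary units)
top_level = [True, False, True, True]   # struct-level validity
child = [True, True, True, True]        # validity of the timestamp child

if top_level_covers_child(top_level, child):
    # Assumed invariant holds: the top-level bitmap alone is sufficient.
    result = instants(timestamps, top_level)
else:
    # Invariant violated: fall back to intersecting both bitmaps.
    result = instants(timestamps, effective_validity(top_level, child))

print(result)  # [10, None, 30, 40]
```

   If the spec does not guarantee the invariant, every kernel would need the 
fallback branch, which is the kind of divergence a recommendation could avoid.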



