> Would there be any reason to have (or not have) a canonical LogicalType
for these in Parquet as well?

I think it would be appropriate to add this to Parquet as well. I assume
there's a different process / mailing list for that?

> our goal here should be to standardize existing practice, not come up
with a novel representation, IMHO.

BigQuery is using 128 bits, which is why I went with that representation in this proposal.

Trino is using 96 bits
(https://github.com/trinodb/trino/blob/eef66628759d7244c176f62be45f3d9f0e5a1a5d/core/trino-spi/src/main/java/io/trino/spi/type/LongTimestampType.java),
but it doesn't seem to me that 96 bits would be much more efficient than
128.
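
For reference, a minimal sketch of the two layouts (the field names on
the Trino side are my paraphrase of LongTimestamp, not its exact API):

  #include <cstdint>

  // Single signed 128-bit count of picoseconds since the epoch,
  // shown as two 64-bit limbs for portability.
  struct Timestamp128 {
    uint64_t lo;
    int64_t hi;  // high limb carries the sign
  };

  // Trino-style split: 64-bit microseconds since the epoch plus a
  // 32-bit picosecond-of-microsecond fraction (0..999999).
  struct Timestamp96 {
    int64_t epoch_micros;
    int32_t picos_of_micro;
  };

  // On typical 64-bit platforms both occupy 16 bytes once alignment
  // padding is applied.
  static_assert(sizeof(Timestamp128) == 16, "two 64-bit limbs");
  static_assert(sizeof(Timestamp96) == 16, "12 bytes padded to 16");

Unless the 96-bit values are stored packed, they would pad out to 16
bytes per element anyway, which is why I don't think 96 bits saves much
in practice.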

Tim Sweña (Swast)
Team Lead, BigQuery DataFrames
Google Cloud Platform
Chicago, IL, USA


On Wed, Nov 19, 2025 at 3:35 AM Antoine Pitrou <[email protected]> wrote:

>
> I don't have a personal opinion on which representation is technically
> better, but our goal here should be to standardize existing practice,
> not come up with a novel representation, IMHO.
>
> Regards
>
> Antoine.
>
>
> On 18/11/2025 at 23:45, Felipe Oliveira Carvalho wrote:
> > One reason to avoid 128-bit integers is the requirement for 128-bit
> > operations that it creates. Many high-resolution time representations
> > split the value into two integers in a way that is useful for many
> > time-related operations.
> >
> > Picosecond resolution can be achieved by splitting into a (seconds:
> > i64, picoseconds: i64) pair, where the picoseconds within a second fit
> > in 40 bits and the seconds field can represent far more than 10K years.
> >
> > This removes the need for a 128-bit division by 86400 to do anything
> > interesting with the picosecond timestamp. This layout could be a
> > Canonical Extension Type proposal, with the seconds field being one of
> > the existing timestamp types, allowing very cheap casts from the
> > extension type to a timestamp with seconds precision.
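> >
> > For illustration, a rough sketch of that layout and the kind of
> > operation it makes cheap (a hypothetical struct, not a concrete
> > proposal):
> >
> >   #include <cstdint>
> >
> >   struct TimestampPico {
> >     int64_t seconds;      // seconds since the epoch
> >     int64_t picoseconds;  // 0..999'999'999'999 within the second
> >   };
> >
> >   // Extracting the day needs only 64-bit arithmetic; no 128-bit
> >   // division by 86400 is involved. (Floor division for negative
> >   // seconds is omitted for brevity.)
> >   int64_t days_since_epoch(const TimestampPico& t) {
> >     return t.seconds / 86400;
> >   }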
> >
> > --
> > Felipe
> >
> > On Tue, Nov 18, 2025 at 6:22 PM Curt Hagenlocher <[email protected]>
> > wrote:
> >
> >> For both Duration and Timestamp, this would require adding a new field
> >> to the FlatBuffers spec. That should be okay, right?
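> >>
> >> If it helps make that concrete, the change could mirror how Decimal
> >> already carries a bitWidth field in Schema.fbs (a sketch only; the
> >> exact field name, default, and placement would be up to the
> >> proposal):
> >>
> >>   table Timestamp {
> >>     unit: TimeUnit;
> >>     timezone: string;
> >>     bitWidth: int = 64;  // new field; only 64 is valid today
> >>   }
> >>
> >> Since FlatBuffers allows appending new fields with defaults, readers
> >> and writers that never set bitWidth would keep working unchanged.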
> >>
> >> A 128-bit timestamp would be useful at nanosecond scale as well;
> >> there are databases like Snowflake which support precision and scale
> >> settings for timestamps that force either truncation of precision or
> >> clipping of range when represented as Arrow.
> >>
> >> Would there be any reason to have (or not have) a canonical
> >> LogicalType for these in Parquet as well?
> >>
> >> On Fri, Nov 7, 2025 at 1:29 PM Tim Swena <[email protected]> wrote:
> >>>
> >>> Hello,
> >>>
> >>> Per the process described at
> >>> https://arrow.apache.org/docs/format/Changing.html#discussion-and-voting-process
> >>> I am starting a discussion thread for the following spec change proposal:
> >>>
> >>>   1. Add a new time unit: PICOSECOND, which is unsupported in the
> >>>      existing 64-bit timestamp-related types.
> >>>
> >>>   2. Add support for bitWidth=128 to the timestamp data type, which
> >>>      supports all units, including PICOSECOND.
> >>>
> >>>   3. Add support for bitWidth=128 to the duration data type, which
> >>>      supports all units, including PICOSECOND.
> >>> This is motivated by some currently experimental changes in BigQuery
> >>> to support picosecond-precision timestamps (source:
> >>> <https://docs.cloud.google.com/bigquery/docs/reference/storage/rpc/google.cloud.bigquery.storage.v1?content_ref=read%20api%20will%20return%20full%20precision%20picosecond%20value%20the%20value%20will%20be%20encoded%20as%20a%20string%20which%20conforms%20to%20iso%208601%20format#picostimestampprecision>),
> >>> but from what I can tell such timestamps already have some support in
> >>> IBM Db2 (source:
> >>> <https://www.ibm.com/docs/en/db2-for-zos/13.0.0?topic=jdbc-dbtimestamp-class&content_ref=the+com+ibm+db2+jcc+dbtimestamp+class+can+be+used+to+create+timestamp+objects+with+a+precision+of+up+to+picoseconds+and+time+zone+information>)
> >>> and Trino (source:
> >>> <https://trino.io/docs/current/language/types.html?content_ref=heading+calendar+date+and+time+of+day+without+a+time+zone+with+pdigits+of+precision+for+the+fraction+of+seconds+a+precision+of+up+to+12+picoseconds+is+supported>).
> >>> Note that the reference implementation(s) are still very much a
> >>> work-in-progress (https://github.com/apache/arrow/pull/48018 for a
> >>> start in C++), but I figured it would be useful to kick off the
> >>> conversation before diving too much further into implementation.
> >>>
> >>> Inspired by other discussions, I've created a draft of a more formal
> >>> RFC document here: Arrow-RFC: timestamp128 and duration128 data types
> >>> with support for picosecond units
> >>> <https://docs.google.com/document/d/1-S0qvYTIEGlLnNkkgyWSHfnIvU4xpFqDQuMNTojaj9A/edit?tab=t.0#heading=h.as1aixu509k7>
> >>>
> >>> Tim Sweña (Swast)
> >>> Team Lead, BigQuery DataFrames
> >>> Google Cloud Platform
> >>> Chicago, IL, USA
> >>
> >
>
>
