Ah, got it. Sorry for the noise!

-dewey

On Sat, Jun 21, 2025 at 7:35 PM David Li <lidav...@apache.org> wrote:

> MonthDayNano in Arrow uses calendar days, but as noted Iceberg's proposed
> DAY_TIME interval is a duration in Arrow parlance. So if you add "1 day" in
> Iceberg (which is by definition exactly 86400 seconds) to a timestamp right
> before a DST transition, the result will be off by an hour compared to
> adding one day in Arrow.
>
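
A minimal sketch of the DST discrepancy described above, assuming Python's standard `zoneinfo` module with system time zone data and using America/New_York (spring forward on 2025-03-09) purely as an illustrative zone. The helper names `add_calendar_days` and `add_exact_seconds` are hypothetical and are not Arrow or Iceberg APIs; they only contrast the two semantics.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Illustrative zone only: US Eastern time springs forward on 2025-03-09.
NY = ZoneInfo("America/New_York")


def add_calendar_days(ts: datetime, days: int) -> datetime:
    """Calendar-day semantics: keep the same wall-clock time on a later date.

    Adding a timedelta to an aware datetime in Python operates on the
    local wall clock and does not account for UTC offset changes.
    """
    return ts + timedelta(days=days)


def add_exact_seconds(ts: datetime, seconds: int) -> datetime:
    """Duration semantics: advance the instant by an exact number of seconds."""
    as_utc = ts.astimezone(timezone.utc)
    return (as_utc + timedelta(seconds=seconds)).astimezone(ts.tzinfo)


# Noon on the day before the DST transition.
start = datetime(2025, 3, 8, 12, 0, tzinfo=NY)

calendar_result = add_calendar_days(start, 1)       # "1 day" as a calendar day
duration_result = add_exact_seconds(start, 86_400)  # "1 day" as 86400 seconds

# Compare the two results as instants on the UTC timeline.
gap = duration_result.astimezone(timezone.utc) - calendar_result.astimezone(timezone.utc)

print("calendar day :", calendar_result.isoformat())
print("86400 seconds:", duration_result.isoformat())
print("difference   :", gap)
```

Under these assumptions the two results land on the same date but one hour apart, which is the mismatch between a calendar-day interval and a fixed-length duration.
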
> On Sun, Jun 22, 2025, at 04:54, Dewey Dunnington wrote:
> > I may be misunderstanding the MonthDayNano type, but I think it gives a
> > range of roughly +/- INT32_MAX days (5.8 million years?) at nanosecond
> > precision without considering the months component?
> >
> > On Sat, Jun 21, 2025 at 12:58 PM David Li <lidav...@apache.org> wrote:
> >
> >> Hello Arrow devs,
> >>
> >> There's an ongoing discussion in Iceberg [1] and Parquet [2] to define and
> >> standardize new interval types. Of course, it would be ideal if these new
> >> types had a canonical representation in Arrow. YEAR_MONTH is the same as
> >> Arrow's month interval; DAY_TIME, however, is actually a 128-bit
> >> nanosecond duration, and hence I don't think it can be represented by
> >> MonthDayNano or the duration type.
> >>
> >> It might be interesting to consider whether there's some other way to
> >> encode this type in Arrow (or if an extension type should be considered),
> >> or find a way to define it that would more easily map onto an existing type
> >> (while still meeting the Iceberg goal of being ANSI SQL compatible, which
> >> apparently requires +/- 10000 years of range).
> >>
> >> [1]: https://lists.apache.org/thread/65sxmjcfpvbp262dh73v5m4zjdgzt7j1
> >> [2]: https://github.com/apache/parquet-format/pull/496
> >>
> >> -David
>
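
The range figures mentioned in the thread can be checked with a little arithmetic. This is a rough sketch, assuming a mean Gregorian year of 365.2425 days and fixed 86400-second days; the variable names are illustrative only. It compares a 32-bit day count, a 64-bit nanosecond count, and the +/- 10000 years quoted as the ANSI SQL requirement.

```python
# Assumptions for this back-of-the-envelope check:
# a mean Gregorian year of 365.2425 days and fixed 86400-second days.
NANOS_PER_SECOND = 10**9
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.2425
DAYS_PER_10000_YEARS = 3_652_425  # 10000 * 365.2425, kept as an exact integer

INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

# Range of a signed 32-bit day count, in years.
years_from_int32_days = INT32_MAX / DAYS_PER_YEAR

# Range of a signed 64-bit nanosecond count, in years.
years_from_int64_nanos = (
    INT64_MAX / NANOS_PER_SECOND / SECONDS_PER_DAY / DAYS_PER_YEAR
)

# Nanoseconds in 10000 years, and how many magnitude bits that needs.
nanos_in_10000_years = DAYS_PER_10000_YEARS * SECONDS_PER_DAY * NANOS_PER_SECOND
magnitude_bits = nanos_in_10000_years.bit_length()

print(f"int32 days   : about {years_from_int32_days:,.0f} years")
print(f"int64 nanos  : about {years_from_int64_nanos:,.1f} years")
print(f"10000 years  : {nanos_in_10000_years:,} ns ({magnitude_bits} bits + sign)")
print("fits in int64:", nanos_in_10000_years <= INT64_MAX)
```

Under these assumptions, a single 64-bit nanosecond count covers only a few hundred years, so +/- 10000 years at nanosecond precision needs a wider integer, while a 32-bit day count has ample range but carries calendar-day rather than fixed-duration semantics.
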
