I agree, it is what I would have proposed for the interval type if there weren't an interval type in Arrow already. I think FixedSizeList has, for better or worse, solved a lot of the problems that a struct type would be used for (e.g. coordinates).
Cheers,
Micah

On Tue, Aug 31, 2021 at 8:27 AM Wes McKinney <wesmck...@gmail.com> wrote:
> I do still think that having a "packed C struct" type would be a
> useful thing, but thus far no one has needed it enough to develop
> something in the columnar format specification.
>
> On Tue, Aug 31, 2021 at 1:33 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > Hi Jorge,
> > Are there places in the docs that you think this would simplify?
> > There is an old JIRA [1] about introducing a c-struct type that I
> > think aligns with this observation [1]
> >
> > -Micah
> >
> > [1] https://issues.apache.org/jira/browse/ARROW-1790
> >
> > On Mon, Aug 30, 2021 at 2:57 PM Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > Just came across this curiosity that IMO may help us to design physical
> > > types in the future.
> > >
> > > Not sure if this was mentioned before, but it seems to me that
> > > `DaysMilliseconds` and `MonthDayNano` belong to a broader class of physical
> > > types "typed tuples" in that they are constructed by defining the tuple
> > > `(t_1,t_2,...,t_N)` where t_i (e.g. int32) is representable in memory for a
> > > given endianness, and each element of the array is written to the buffer
> > > back to back as `<t1 in endianness><t2 in endianness>...<tN in endianness>`.
> > >
> > > Primitive arrays such as e.g. `Int32Array` are the extreme case where the
> > > tuple has a single entry (t1,), which leads to `<int32 in endianness>`. The
> > > others are:
> > > * DaysMilliseconds = (int32, int32)
> > > * MonthDayNano = (int32, int32, int64)
> > >
> > > In principle, we could re-write the in-memory layout page in these terms
> > > that places all the types above in the same "bucket".
> > >
> > > Best,
> > > Jorge
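
For illustration, the "typed tuple" layout Jorge describes can be sketched with Python's standard `struct` module. This is only a rough sketch under stated assumptions: little-endian byte order, no padding between fields, and the names `DAYS_MILLISECONDS`, `MONTH_DAY_NANO`, `INT32`, `pack_values` and `unpack_values` are hypothetical helpers made up for this example, not part of any Arrow API.

```python
import struct

# Assumption: little-endian ("<"), which also disables padding in struct.
# "i" is int32 and "q" is int64, matching the tuples listed in the thread.
INT32 = struct.Struct("<i")               # (int32,)
DAYS_MILLISECONDS = struct.Struct("<ii")  # (int32, int32)
MONTH_DAY_NANO = struct.Struct("<iiq")    # (int32, int32, int64)


def pack_values(layout, values):
    """Write each tuple back to back into one contiguous buffer."""
    return b"".join(layout.pack(*value) for value in values)


def unpack_values(layout, buffer):
    """Read the tuples back out of a buffer written by pack_values."""
    return list(layout.iter_unpack(buffer))


# Two MonthDayNano elements occupy 2 * 16 contiguous bytes.
buf = pack_values(MONTH_DAY_NANO, [(1, 2, 3), (4, 5, 6)])
assert len(buf) == 2 * MONTH_DAY_NANO.size
assert unpack_values(MONTH_DAY_NANO, buf) == [(1, 2, 3), (4, 5, 6)]
```

A primitive array is then the same helper applied to a one-field layout, e.g. `pack_values(INT32, [(7,), (8,)])`, which is what puts all of these types in the same "bucket".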