Hi Adam,

You are referring to the issue you raised on the Arrow repo [1], which turned into a good discussion about FixedSizeList and its current conversion to Parquet.

Please correct me if I am wrong, but the outcome of the discussion was that the conversion is still pretty fast (much faster than commonly used serialization formats for tensors), though not as fast as for other primitives in Apache Arrow.

My opinion is that optimising the conversion of FixedSizeList from Arrow to Parquet can be discussed separately, if there is still a need to do so. For this canonical extension type I would say it is an implementation detail, and you mention a way to handle that with Parquet in the issue comment [2].

I do not think there should be any issues with the conversion to pandas. The conversion to NumPy is not expensive, and I would expect the conversion to pandas to be the same. See the illustrative PyArrow implementation [3].

[1]: https://github.com/apache/arrow/issues/34510
[2]: https://github.com/apache/arrow/issues/34510#issuecomment-1464463384
[3]: https://github.com/apache/arrow/pull/33948/files#diff-efc1a41cdf04b6ec96d822dbec1f1993e0bbd17050b1b5f1275c8e3443a38828

All well,
Alenka

On Fri, Mar 10, 2023 at 11:32 PM Adam Lippai <a...@rigo.sk> wrote:

> Since the specification explicitly mentions FixedSizeList, but the current
> conversion to/from parquet is expensive compared to doubles and other
> primitives (the nested type needs repetition and definition levels) should
> we discuss what’s the recommendation when integrating with other non-arrow
> systems or is that an implementation detail only? (Pandas, parquet)
>
> Best regards,
> Adam Lippai
>
> On Wed, Mar 8, 2023 at 1:13 AM Alenka Frim <ale...@voltrondata.com.invalid>
> wrote:
>
> > >
> > > Just one comment, though: since we also define a separate "Tensor" IPC
> > > structure in Arrow, maybe we should state the relationship somewhere in
> > > the documentation? (Even if the answer is "no relationship".)
> > >
> >
> > Agree David, thanks for bringing it up.
> >
> > I will add the information about "no relationship" to the Tensor IPC
> > structure into the spec and will also keep in mind to add it to the
> > documentation that follows the implementations.
> >