> Is this buffer lengths buffer only present if the array type is Utf8View?

IIUC, the proposal would add the buffer lengths buffer for all types if the
schema's flags include ARROW_FLAG_BUFFER_LENGTHS. I do find it appealing
that this avoids the special case and that `n_buffers` would continue to be
consistent with IPC.
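
To make that reading concrete, here is a rough consumer-side sketch of how I
understand it, not a definitive implementation. The struct definitions are
the ones from the C Data Interface spec. Everything else is an assumption on
my part: the thread does not assign a bit to ARROW_FLAG_BUFFER_LENGTHS (8 is
just the next unused one), the lengths are assumed to be int64_t byte counts
stored at array->buffers[array->n_buffers], and buffer_length() is a
hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the thread names this flag but assigns no value. 8 is the
   next bit after the existing schema flags (1, 2 and 4). */
#define ARROW_FLAG_BUFFER_LENGTHS 8

/* Struct layouts as defined by the Arrow C Data Interface. */
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* Hypothetical helper: returns the byte length of buffer i, or -1 when the
   producer did not set the flag or i is out of range.
   ASSUMPTION: lengths are int64_t values in one extra trailing buffer, so
   n_buffers itself stays the same as in IPC. */
static int64_t buffer_length(const struct ArrowSchema* schema,
                             const struct ArrowArray* array, int64_t i) {
  if ((schema->flags & ARROW_FLAG_BUFFER_LENGTHS) == 0) {
    return -1; /* no lengths advertised; fall back to computing them */
  }
  if (i < 0 || i >= array->n_buffers) {
    return -1;
  }
  const int64_t* lengths = (const int64_t*)array->buffers[array->n_buffers];
  assert(lengths != NULL); /* a producer setting the flag must provide it */
  return lengths[i];
}
```

A consumer that gets -1 back would still have to calculate the size from the
type, length and offsets, which is what happens today.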

On Thu, Oct 26, 2023 at 1:35 PM Weston Pace <weston.p...@gmail.com> wrote:

> Is this buffer lengths buffer only present if the array type is Utf8View?
> Or are you suggesting that other types might want to adopt this as well?
>
> On Thu, Oct 26, 2023 at 10:00 AM Dewey Dunnington
> <de...@voltrondata.com.invalid> wrote:
>
> > > I expect C code to not be much longer than this :-)
> >
> > nanoarrow's buffer-length-calculation and validation concepts are
> > (perhaps inadvisably) intertwined...even with both it is not that much
> > code (perhaps I was remembering how much time it took me to figure out
> > which 35 lines to write :-))
> >
> > > That sounds a bit hackish to me.
> >
> > Including only *some* buffer sizes in array->buffers[array->n_buffers]
> > special-cased for only two types (or altering the number of buffers
> > required by the IPC format vs. the number of buffers required by the C
> > Data interface) seem equally hackish to me (not that I'm opposed to
> > either necessarily...the alternatives really are very bad).
> >
> > > How can you *not* care about buffer sizes, if you for example need to
> > > send the buffers over IPC?
> >
> > I think IPC is the *only* operation that requires that information?
> > (Other than perhaps copying to another device?) I don't think there's
> > any barrier to accessing the content of all the array elements but I
> > could be mistaken.
> >
> > On Thu, Oct 26, 2023 at 1:04 PM Antoine Pitrou <anto...@python.org> wrote:
> > >
> > >
> > > Le 26/10/2023 à 17:45, Dewey Dunnington a écrit :
> > > > The lack of buffer sizes is something that has come up for me a few
> > > > times working with nanoarrow (which dedicates a significant amount of
> > > > code to calculating buffer sizes, which it uses to do validation and
> > > > more efficient copying).
> > >
> > > By the way, this is a bit surprising since it's really 35 lines of code
> > > in C++ currently:
> > >
> > > https://github.com/apache/arrow/blob/57f643c2cecca729109daae18c7a64f3a37e76e4/cpp/src/arrow/c/bridge.cc#L1721-L1754
> > >
> > > > I expect C code to not be much longer than this :-)
> > >
> > > Regards
> > >
> > > Antoine.
> >
>
