> Is this buffer lengths buffer only present if the array type is Utf8View?
IIUC, the proposal would add the buffer lengths buffer for all types
if the schema's flags include ARROW_FLAG_BUFFER_LENGTHS. I do find it
appealing to avoid the special case and that `n_buffers` would
continue to be consistent with IPC.

On Thu, Oct 26, 2023 at 1:35 PM Weston Pace <weston.p...@gmail.com> wrote:
> Is this buffer lengths buffer only present if the array type is Utf8View?
> Or are you suggesting that other types might want to adopt this as well?
>
> On Thu, Oct 26, 2023 at 10:00 AM Dewey Dunnington
> <de...@voltrondata.com.invalid> wrote:
>
> > > I expect C code to not be much longer than this :-)
> >
> > nanoarrow's buffer-length-calculation and validation concepts are
> > (perhaps inadvisably) intertwined...even with both it is not that much
> > code (perhaps I was remembering how much time it took me to figure out
> > which 35 lines to write :-))
> >
> > > That sounds a bit hackish to me.
> >
> > Including only *some* buffer sizes in array->buffers[array->n_buffers]
> > special-cased for only two types (or altering the number of buffers
> > required by the IPC format vs. the number of buffers required by the C
> > Data interface) seem equally hackish to me (not that I'm opposed to
> > either necessarily...the alternatives really are very bad).
> >
> > > How can you *not* care about buffer sizes, if you for example need to
> > > send the buffers over IPC?
> >
> > I think IPC is the *only* operation that requires that information?
> > (Other than perhaps copying to another device?) I don't think there's
> > any barrier to accessing the content of all the array elements but I
> > could be mistaken.
> >
> > On Thu, Oct 26, 2023 at 1:04 PM Antoine Pitrou <anto...@python.org>
> > wrote:
> > >
> > >
> > > Le 26/10/2023 à 17:45, Dewey Dunnington a écrit :
> > > > The lack of buffer sizes is something that has come up for me a few
> > > > times working with nanoarrow (which dedicates a significant amount of
> > > > code to calculating buffer sizes, which it uses to do validation and
> > > > more efficient copying).
> > >
> > > By the way, this is a bit surprising since it's really 35 lines of code
> > > in C++ currently:
> > >
> > >
> > > https://github.com/apache/arrow/blob/57f643c2cecca729109daae18c7a64f3a37e76e4/cpp/src/arrow/c/bridge.cc#L1721-L1754
> > >
> > > I expect C code to not be much longer than this :-)
> > >
> > > Regards
> > >
> > > Antoine.
> > >
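
To make the flag-based variant discussed at the top of this message concrete, here is a rough sketch of how a consumer might read the lengths. The struct layouts follow the Arrow C Data Interface; everything else is an assumption for illustration only: the numeric value of ARROW_FLAG_BUFFER_LENGTHS, the placement of the lengths at array->buffers[array->n_buffers], the int64 byte-count encoding, and the helper names BufferLengths and TotalBufferBytes. None of this is a specified API.

```c
#include <stddef.h>
#include <stdint.h>

/* Struct layouts as defined by the Arrow C Data Interface. */
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* HYPOTHETICAL value: the flag is only a proposal in this thread.
 * The existing schema flags use the bits 1, 2 and 4. */
#define ARROW_FLAG_BUFFER_LENGTHS 8

/* ASSUMPTION: when the flag is set, one extra buffer sits just past the
 * n_buffers regular buffers and holds n_buffers int64 byte lengths.
 * Returns NULL when the producer did not set the flag. */
static const int64_t* BufferLengths(const struct ArrowSchema* schema,
                                    const struct ArrowArray* array) {
  if ((schema->flags & ARROW_FLAG_BUFFER_LENGTHS) == 0) {
    return NULL;
  }
  return (const int64_t*)array->buffers[array->n_buffers];
}

/* Sums the byte lengths of all regular buffers, e.g. to size an IPC
 * message body. Returns -1 when no lengths buffer is available. */
static int64_t TotalBufferBytes(const struct ArrowSchema* schema,
                                const struct ArrowArray* array) {
  const int64_t* lengths = BufferLengths(schema, array);
  if (lengths == NULL) {
    return -1;
  }
  int64_t total = 0;
  for (int64_t i = 0; i < array->n_buffers; i++) {
    total += lengths[i];
  }
  return total;
}
```

Under these assumptions n_buffers keeps the same meaning as in IPC, and a consumer that does not care about sizes simply never looks past buffers[n_buffers - 1].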