mikelui commented on issue #32439: URL: https://github.com/apache/arrow/issues/32439#issuecomment-1688508294
I think the root cause is here:

* https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/converter.h#L368
* https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/converter.h#L377
* https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/converter.h#L86
* https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/array_base.cc#L272
* https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/data.cc#L139
* https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/util.cc#L317
* https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/util.cc#L65

In the blanket implementation of `Array::Slice`, only the `offset` and `length` of the underlying `ArrayData` are modified. For e.g. `ListArray`s, this means that when looking at its length and traversing its internal value offsets (its `ArrayData`), things will look as expected. But when we check the underlying `child_data` (the actual `values` `ArrayData`), we won't have the same sliced view.

I did a janky sanity test: I made `Array::Slice` virtual and overrode it in `ListArray` to slice the offset/length of the `child_data`, which seemed to work. But that's probably not what we want as an actual fix.

I'm guessing this will be the same for all nested types. Hence, we get the validation error, since the validation probably isn't expecting `Slice`s that have extra data.

I'm not sure what the best path forward is here. It's not clear to me:

* whether the `Slice` is actually what we want (do we want to keep the extra data around?)
* how to support "deep slicing" for all the other less common types I don't have experience with (Unions, Dictionary, etc.)

@westonpace can you advise if this sounds like the right place, and what next steps should look like?
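To make the shallow-versus-deep distinction concrete, here is a minimal sketch built on a toy stand-in for `ArrayData`. Everything in it (`ToyArrayData`, `ShallowSlice`, `DeepSliceList`, `MakeToyList`) is hypothetical and is not Arrow's API. It only assumes the behaviour described above: a plain slice adjusts the top-level `offset`/`length` and leaves `child_data` alone. The "deep" variant is one possible approach, not a proposed fix; it also rebases the value offsets so they stay valid against the narrowed child.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Toy stand-in for arrow::ArrayData (hypothetical, heavily simplified).
// For a list, `buffer` holds the value offsets; for a leaf, the values.
struct ToyArrayData {
  int64_t length = 0;
  int64_t offset = 0;
  std::vector<int32_t> buffer;
  std::vector<std::shared_ptr<ToyArrayData>> child_data;
};

// Shallow slice: only the top-level offset/length change.
// The children are shared as-is, so they still describe the full values.
std::shared_ptr<ToyArrayData> ShallowSlice(const ToyArrayData& data,
                                           int64_t off, int64_t len) {
  auto out = std::make_shared<ToyArrayData>(data);
  out->offset = data.offset + off;
  out->length = len;
  return out;
}

// Hypothetical "deep" slice for a list-like array: narrow the child to the
// range the sliced offsets refer to, and rebase the offsets to start at 0.
std::shared_ptr<ToyArrayData> DeepSliceList(const ToyArrayData& data,
                                            int64_t off, int64_t len) {
  const int64_t start = data.offset + off;
  assert(off >= 0 && len >= 0);
  assert(static_cast<std::size_t>(start + len) < data.buffer.size());

  const int32_t first = data.buffer[start];
  const int32_t last = data.buffer[start + len];

  auto out = std::make_shared<ToyArrayData>();
  out->length = len;
  out->offset = 0;
  for (int64_t i = 0; i <= len; ++i) {
    out->buffer.push_back(data.buffer[start + i] - first);
  }
  out->child_data.push_back(
      ShallowSlice(*data.child_data[0], first, last - first));
  return out;
}

// Builds the toy list [[1, 2], [3], [4, 5, 6]].
std::shared_ptr<ToyArrayData> MakeToyList() {
  auto values = std::make_shared<ToyArrayData>();
  values->length = 6;
  values->buffer = {1, 2, 3, 4, 5, 6};

  auto list = std::make_shared<ToyArrayData>();
  list->length = 3;
  list->buffer = {0, 2, 3, 6};
  list->child_data.push_back(values);
  return list;
}
```

With this toy, `ShallowSlice(*MakeToyList(), 1, 2)` reports a list of length 2 whose child still has all 6 values, which is the "extra data" situation; `DeepSliceList` with the same arguments leaves the child covering only the 4 referenced values.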
