mikelui commented on issue #32439:
URL: https://github.com/apache/arrow/issues/32439#issuecomment-1688508294

   I think the root cause is here:
   
   * https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/converter.h#L368
   * https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/converter.h#L377
   * https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/converter.h#L86
   * https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/array_base.cc#L272
   * https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/data.cc#L139
   * https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/util.cc#L317
   * https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/util.cc#L65
   
   In the blanket implementation of `Array::Slice`, only the `offset` and 
`length` of the underlying `ArrayData` are modified.
   
   For `ListArray`s, for example, this means that when looking at the length 
and traversing the internal value offsets (the list's own `ArrayData`), things 
will look as expected. But when we check the underlying `child_data` (the 
actual `values` `ArrayData`), we won't have the same sliced view.
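
   A rough sketch of the shape of the problem, using a made-up `ToyArrayData` 
struct rather than the real `arrow::ArrayData` (the struct, `ShallowSlice` and 
`MakeToyList` are all hypothetical names for illustration only):

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Toy stand-in for arrow::ArrayData (hypothetical, heavily simplified).
struct ToyArrayData {
  int64_t length = 0;
  int64_t offset = 0;
  std::vector<int32_t> buffer;  // list offsets (parent) or values (child)
  std::vector<std::shared_ptr<ToyArrayData>> child_data;
};

// Shallow slice: copy the struct and adjust only offset and length.
// child_data pointers are copied as-is, so the children are not narrowed.
std::shared_ptr<ToyArrayData> ShallowSlice(const ToyArrayData& data,
                                           int64_t off, int64_t len) {
  assert(off >= 0 && len >= 0 && off + len <= data.length);
  auto out = std::make_shared<ToyArrayData>(data);
  out->offset = data.offset + off;
  out->length = len;
  return out;
}

// Builds a toy list<int32> holding [[1, 2], [3], [4, 5, 6]].
std::shared_ptr<ToyArrayData> MakeToyList() {
  auto values = std::make_shared<ToyArrayData>();
  values->length = 6;
  values->buffer = {1, 2, 3, 4, 5, 6};

  auto list = std::make_shared<ToyArrayData>();
  list->length = 3;
  list->buffer = {0, 2, 3, 6};  // list i spans values [buffer[i], buffer[i+1])
  list->child_data = {values};
  return list;
}
```

   With `ShallowSlice(*MakeToyList(), 1, 2)`, the parent reports `offset == 1` 
and `length == 2`, while `child_data[0]` still reports `offset == 0` and 
`length == 6`, i.e. it still covers the values of the list that was sliced away.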
   
   I did a janky sanity test that makes `Array::Slice` virtual and overrides 
it in `ListArray` to also slice the offset/length of the `child_data`, which 
seemed to work. But that's probably not what we want as an actual fix.
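
   To illustrate the idea only (this is not the actual patch, and 
`ToyArrayData` / `DeepSliceList` are hypothetical names, not Arrow APIs), a 
"deep" slice for a list could narrow the child view to the range of values that 
the kept lists point at:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Toy stand-in for arrow::ArrayData (hypothetical, heavily simplified).
struct ToyArrayData {
  int64_t length = 0;
  int64_t offset = 0;
  std::vector<int32_t> buffer;  // list offsets (parent) or values (child)
  std::vector<std::shared_ptr<ToyArrayData>> child_data;
};

// "Deep" slice for a toy list array: besides adjusting the parent's
// offset/length, narrow the child to the values covered by the kept lists.
// Assumption of this sketch: the stored list offsets are NOT rebased, so a
// real fix would still have to keep them consistent with the narrowed child.
std::shared_ptr<ToyArrayData> DeepSliceList(const ToyArrayData& list,
                                            int64_t off, int64_t len) {
  assert(off >= 0 && len >= 0 && off + len <= list.length);
  assert(list.child_data.size() == 1);

  const int64_t first = list.offset + off;       // first kept list
  const int32_t begin = list.buffer[first];      // first value it refers to
  const int32_t end = list.buffer[first + len];  // one past the last value

  // Copy the child's metadata and shrink its view; the original is untouched.
  auto child = std::make_shared<ToyArrayData>(*list.child_data[0]);
  child->offset += begin;
  child->length = end - begin;

  auto out = std::make_shared<ToyArrayData>(list);
  out->offset = first;
  out->length = len;
  out->child_data = {child};
  return out;
}
```

   For a toy list `[[1, 2], [3], [4, 5, 6]]` (offsets `{0, 2, 3, 6}`), 
`DeepSliceList(list, 1, 2)` leaves the child with `offset == 2` and 
`length == 4`, so the extra values are no longer visible through the slice.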
   
   I'm guessing this will be the same for all nested types. Hence the 
validation error: validation probably isn't expecting `Slice`s that carry 
extra data.
   
   I'm not sure what the best path forward is here. It's not clear to me:
   * whether the `Slice` is actually what we want (do we want to keep the extra 
data around?)
   * how to support "deep slicing" for all the other less common types I don't 
have experience with (Unions, Dictionary, etc)
   
   @westonpace can you advise if this sounds like the right place, and what 
next steps should look like?
   
