Jefffrey opened a new issue, #9068: URL: https://github.com/apache/arrow-rs/issues/9068
**Which part is this question about**

https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/mod.rs#L176-L191

**Describe your question**

What exactly is this supposed to represent, and what is the use case of this function?

In the simple case the answer might seem obvious from the docstring:

https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/mod.rs#L185-L189

- If we slice an array by an offset, calling `offset` on the sliced array returns that offset; simple!

But primitive arrays don't even support this:

https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/primitive_array.rs#L1229-L1231

So regardless of whether a primitive array has been sliced, it always reports an offset of 0.

We might consider this a bug to be fixed, but on closer thought, **which** offset should we return? Technically a primitive array has two buffers: the values buffer and the null buffer. With `slice` this is trivial, since the same offset applies to both. However, if we manually construct a primitive array from values and null buffers that were each pre-sliced by a different amount, what does the offset become?

```rust
use arrow_array::Int64Array;
use arrow_buffer::{NullBuffer, ScalarBuffer};

let values: ScalarBuffer<i64> = vec![1, 2, 3].into();
let nulls: NullBuffer = vec![true, true, true].into();
// Pre-slice each buffer by a different amount before building the array.
let values = values.slice(1, 1); // offset 1
let nulls = nulls.slice(2, 1); // offset 2
let array = Int64Array::new(values, Some(nulls));
```

- What should the offset be?

We could sidestep this by defining an offset to be valid only when it results from a slice (so pre-slicing the buffers and then creating an array from them would not count as slicing), but I feel this would be inconsistent.

**Additional context**

**Arrays that implement `offset`**

Run array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/run_array.rs#L297-L299

Dictionary array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/dictionary_array.rs#L734-L736

- Just delegates to key array

Boolean array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/boolean_array.rs#L325-L327

**Arrays that always leave `offset` as 0**

Byte array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/byte_array.rs#L502-L504

List view array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/list_view_array.rs#L456-L458

Map array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/map_array.rs#L401-L403

List array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/list_array.rs#L565-L567

Struct array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/struct_array.rs#L440-L442

Fixed size binary array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/fixed_size_binary_array.rs#L641-L643

Null array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/null_array.rs#L108-L110

Fixed size list array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/fixed_size_list_array.rs#L501-L503

Union array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/union_array.rs#L781-L783

Byte view array
https://github.com/apache/arrow-rs/blob/814ee4227c01fce478bdd3594dd156250286b46e/arrow-array/src/array/byte_view_array.rs#L895-L897
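To make the ambiguity in the pre-sliced buffers question concrete without depending on arrow-rs internals, here is a minimal toy model using only the standard library. `ToyBuffer` and `ToyPrimitiveArray` are hypothetical types invented for illustration; they are not arrow-rs APIs, and returning `Option<usize>` is just one possible way to define the offset, not a proposal for what the library does or should do.

```rust
use std::rc::Rc;

/// Toy stand-in for a sliced buffer: a window (`offset`, `len`) over shared data.
/// Hypothetical type for illustration only; not part of arrow-rs.
struct ToyBuffer<T> {
    data: Rc<Vec<T>>,
    offset: usize,
    len: usize,
}

impl<T> ToyBuffer<T> {
    fn new(data: Vec<T>) -> Self {
        let len = data.len();
        Self { data: Rc::new(data), offset: 0, len }
    }

    /// Returns a narrower window over the same data; offsets accumulate.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        Self { data: Rc::clone(&self.data), offset: self.offset + offset, len }
    }

    /// The elements visible through this window.
    fn as_slice(&self) -> &[T] {
        &self.data[self.offset..self.offset + self.len]
    }
}

/// Toy primitive array: a values buffer plus an optional validity buffer.
struct ToyPrimitiveArray {
    values: ToyBuffer<i64>,
    nulls: Option<ToyBuffer<bool>>,
}

impl ToyPrimitiveArray {
    fn new(values: ToyBuffer<i64>, nulls: Option<ToyBuffer<bool>>) -> Self {
        if let Some(n) = &nulls {
            assert_eq!(values.len, n.len, "values and nulls must have equal length");
        }
        Self { values, nulls }
    }

    /// One possible definition (an assumption of this sketch): report an
    /// offset only when every buffer agrees on it.
    fn offset(&self) -> Option<usize> {
        match &self.nulls {
            None => Some(self.values.offset),
            Some(n) if n.offset == self.values.offset => Some(n.offset),
            Some(_) => None, // buffers disagree, so no single offset exists
        }
    }
}
```

In this model, slicing both buffers by the same amount (`values.slice(1, 2)` with `nulls.slice(1, 2)`) yields `Some(1)`, while the pre-sliced case from the question (`values.slice(1, 1)` with `nulls.slice(2, 1)`) yields `None`: both buffers are valid and have equal length, yet no single number describes the array's offset.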
