mosalx opened a new issue, #35358:
URL: https://github.com/apache/arrow/issues/35358
### Describe the bug, including details regarding any error messages, version, and platform.
# Summary
`ListArray.flatten` and `FixedSizeListArray.flatten` return the array's
values, sliced according to the offsets of the parent array.
When the same array is wrapped in a `ChunkedArray` and its values are a
`StructArray`, the output is a list holding a chunked array of the parent
array. In other words, `ChunkedArray.flatten` returns the array itself
instead of its values.
# Environment
Observed on Windows 10
python=3.11.2
pyarrow=11.0.0
# Details
```python
import pyarrow as pa
array = pa.array([
    [{'a': 5}, {'a': 6}],
    [{'a': 7}]
])
# same array wrapped in a ChunkedArray
array_chunked = pa.chunked_array([array])
```
Now let's flatten each of these two arrays.
Output of `array.flatten()`
```python
<pyarrow.lib.StructArray object at 0x000001DAA018D8A0>
-- is_valid: all not null
-- child 0 type: int64
[
5,
6,
7
]
```
Output of `array_chunked.flatten()`
```python
[<pyarrow.lib.ChunkedArray object at 0x000001DAAE852A20>
[
[
-- is_valid: all not null
-- child 0 type: int64
[
5,
6
],
-- is_valid: all not null
-- child 0 type: int64
[
7
]
]
]]
```
In other words, the first chunk of the flattened chunked array is equal to
the original array, which should not happen:
```python
assert not array.equals(array_chunked.flatten()[0].chunk(0))  # AssertionError
```
The same issue is observed with `FixedSizeListArray`:
```python
array = pa.array([[{'a': 5}], [{'a': 7}]],
                 type=pa.list_(pa.struct([('a', pa.int32())]), list_size=1))
array_chunked = pa.chunked_array([array])
assert not array.equals(array_chunked.flatten()[0].chunk(0))  # AssertionError
```
Expected behavior:
The flattened chunked array is expected to be a chunked array wrapping the
values of the original array, which are a `StructArray`.
### Component(s)
Python