[
https://issues.apache.org/jira/browse/ARROW-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133993#comment-17133993
]
Daniel Nugent commented on ARROW-7363:
--------------------------------------
It seems like there should be *some* way to get a contiguous buffer of data
from a ChunkedArray, even if it involves copying. I'm working on something right
now where I want to produce Parquet RowGroups of identical length to an
input dataset, and it'd be nice to be able to handle this in Arrow before
passing it off to the analysis functions I'm using.
Could it just be called {{unchunk}} or something? (Maybe a peanut butter pun
would be good: {{creamy}})
> [Python] flatten() doesn't work on ChunkedArray
> -----------------------------------------------
>
> Key: ARROW-7363
> URL: https://issues.apache.org/jira/browse/ARROW-7363
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.15.1
> Reporter: marc abboud
> Priority: Major
>
> flatten() doesn't work on ChunkedArray. It returns a list containing only the
> ChunkedArray itself, without flattening anything.
> {code:python}
> import pyarrow as pa
>
> aa = pa.array([[1], [2]])
> bb = pa.chunked_array([aa, aa])
>
> bb.flatten()
> # Actual output (Out[15]):
> # [<pyarrow.lib.ChunkedArray object> [ [ [ 1 ], [ 2 ] ], [ [ 1 ], [ 2 ] ] ]]
> # Expected:
> # [ <pyarrow.lib.Array object> [ 1, 2 ], <pyarrow.lib.Array object> [ 1, 2 ] ]
> {code}
>