[ https://issues.apache.org/jira/browse/ARROW-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783141#comment-16783141 ]
Philipp Moritz commented on ARROW-4757:
---------------------------------------
One possible way to solve this is to make ChunkedArray a first-class citizen,
i.e. make it a subclass of Array and allow it to participate in IPC. A
UnionArray could then hold a ChunkedArray as a child, which would resolve the
issue described in this ticket.
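
To make the idea concrete, here is a rough sketch in plain Python. It is a toy
model only: the Array, ChunkedArray and UnionArray classes below are
hypothetical stand-ins written for illustration, not the Arrow C++ or pyarrow
classes, and IPC is not modeled at all. It only shows that once a chunked
array satisfies the Array interface, a union can use it as a child without
knowing that it is chunked.

```python
from typing import List, Sequence


class Array:
    """Toy stand-in for a contiguous array (hypothetical, not pyarrow.Array)."""

    def __init__(self, values: Sequence):
        self._values = list(values)

    def __len__(self) -> int:
        return len(self._values)

    def __getitem__(self, i: int):
        return self._values[i]

    def to_pylist(self) -> list:
        # Relies only on __len__ and __getitem__, so subclasses inherit it.
        return [self[i] for i in range(len(self))]


class ChunkedArray(Array):
    """Toy chunked array: a vector of Arrays that itself behaves like an Array."""

    def __init__(self, chunks: Sequence[Array]):
        self.chunks: List[Array] = list(chunks)

    def __len__(self) -> int:
        return sum(len(chunk) for chunk in self.chunks)

    def __getitem__(self, i: int):
        if i < 0:
            i += len(self)
        if i < 0:
            raise IndexError("index out of range")
        # Walk the chunks, translating the logical index to a chunk-local one.
        for chunk in self.chunks:
            if i < len(chunk):
                return chunk[i]
            i -= len(chunk)
        raise IndexError("index out of range")


class UnionArray(Array):
    """Toy dense union: type_ids select a child, offsets select a slot in it."""

    def __init__(self, type_ids: Sequence[int], offsets: Sequence[int],
                 children: Sequence[Array]):
        self.type_ids = list(type_ids)
        self.offsets = list(offsets)
        self.children = list(children)  # any Array, including ChunkedArray

    def __len__(self) -> int:
        return len(self.type_ids)

    def __getitem__(self, i: int):
        return self.children[self.type_ids[i]][self.offsets[i]]


# One child is chunked, the other is contiguous; the union does not care.
ints = ChunkedArray([Array([1, 2]), Array([3])])
strs = Array(["a", "b"])
union = UnionArray(type_ids=[0, 1, 0, 0, 1],
                   offsets=[0, 0, 1, 2, 1],
                   children=[ints, strs])
result = union.to_pylist()
print(result)  # [1, 'a', 2, 3, 'b']
```

In this sketch the lookup into a chunked child is a linear scan over the
chunks. A real implementation would presumably want something cheaper, and
would also have to decide how a chunked child is laid out in IPC messages.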
> Nested chunked array support
> ----------------------------
>
> Key: ARROW-4757
> URL: https://issues.apache.org/jira/browse/ARROW-4757
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Philipp Moritz
> Priority: Major
>
> Dear all,
> I'm currently trying to lift the 2 GB limit on Python serialization. For
> this, I implemented a chunked union builder that splits the array into
> smaller arrays.
> However, some of the children of the union array can be ListArrays, which can
> themselves contain UnionArrays, which can contain ListArrays, and so on. I'm
> at a bit of a loss as to how to handle this. In principle, I'd like to chunk
> the children too. However, UnionArrays can currently only have children of
> type Array, and there is no way to treat a chunked array (which is a vector
> of Arrays) as an Array in order to store it as a child of a UnionArray. Any
> ideas on how best to support this use case?
> -- Philipp.