Joris Van den Bossche created ARROW-7066:
--------------------------------------------

Summary: [Python] support returning ChunkedArray from __arrow_array__ ?
Key: ARROW-7066
URL: https://issues.apache.org/jira/browse/ARROW-7066
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Joris Van den Bossche
Fix For: 1.0.0

The {{\_\_arrow_array\_\_}} protocol was added so that custom objects can define how they should be converted to a pyarrow Array (similar to numpy's {{\_\_array\_\_}}). It is also used to support converting pandas DataFrames whose columns use pandas ExtensionArrays to a pyarrow Table (if the pandas ExtensionArray, such as the nullable integer type, implements this {{\_\_arrow_array\_\_}} method).

This last use case could also be useful for fletcher (https://github.com/xhochy/fletcher/), a package that implements pandas ExtensionArrays that wrap pyarrow arrays, so they can be stored as-is in a pandas DataFrame.

However, fletcher stores ChunkedArrays in its ExtensionArray / the columns of a pandas DataFrame (to have a better mapping to a Table, where the columns also consist of chunked arrays), whereas we currently require that the return value of {{\_\_arrow_array\_\_}} is a pyarrow.Array.

So I was wondering: could we relax this constraint and also allow ChunkedArray as a return value? However, this protocol is currently called in the {{pa.array(..)}} function, which probably should keep returning an Array (and not a ChunkedArray in certain cases).

cc [~uwe]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)