[jira] [Commented] (ARROW-15747) [Python] Support C stream interface of single arrays

Antoine Pitrou (Jira) Thu, 09 Jun 2022 01:41:04 -0700


    [ 
https://issues.apache.org/jira/browse/ARROW-15747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17552043#comment-17552043
 ]


Antoine Pitrou commented on ARROW-15747:
----------------------------------------

Use case 2 (heterogeneous chunking) is easily addressed by redoing the chunking. 
That's what Arrow C++ does when you want to get a RecordBatchReader out of a 
Table. I agree with use cases 1 and 3.
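
To illustrate what "redoing the chunking" means, here is a minimal pure-Python sketch. It is not the Arrow C++ implementation: the helper names {{common_boundaries}} and {{rechunk}} are hypothetical, plain lists stand in for array chunks, and the values are copied, whereas a real implementation would more likely slice the existing chunks.

```python
from itertools import accumulate


def common_boundaries(*chunk_lengths):
    """Merge the cumulative chunk boundaries of several equally long columns."""
    totals = {sum(lengths) for lengths in chunk_lengths}
    if len(totals) != 1:
        raise ValueError("all columns must have the same total length")
    boundaries = set()
    for lengths in chunk_lengths:
        boundaries.update(accumulate(lengths))
    boundaries.discard(0)  # an empty leading chunk would yield an empty batch
    return sorted(boundaries)


def rechunk(chunks, boundaries):
    """Re-slice a list of chunks at the given cumulative boundaries."""
    flat = [value for chunk in chunks for value in chunk]
    starts = [0] + boundaries[:-1]
    return [flat[start:stop] for start, stop in zip(starts, boundaries)]


# Two columns with the same total length but different chunk layouts.
col_a = [[1, 2], [3]]
col_b = [["x"], ["y", "z"]]

bounds = common_boundaries([len(c) for c in col_a], [len(c) for c in col_b])

# After rechunking, every batch has one chunk per column, all of equal length.
batches = list(zip(rechunk(col_a, bounds), rechunk(col_b, bounds)))
```

Each resulting batch pairs chunks of identical length, which is the property a stream of record batches needs.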

There are two ways this could be added to Arrow C++ (and PyArrow):

# return a {{RecordBatchReader}} that would read batches of a single column
# add a facility like {{RecordBatchReader}} but on Arrays

The first approach is easier but perhaps less elegant. The second approach 
would also make it possible to implement an export function, which would be a 
bit clunky under the first approach.


> [Python] Support C stream interface of single arrays
> ----------------------------------------------------
>
>                 Key: ARROW-15747
>                 URL: https://issues.apache.org/jira/browse/ARROW-15747
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Jorge Leitão
>            Priority: Major
>
> It seems that the C stream interface in pyarrow currently requires the array 
> to be a StructArray.
> I do not see this constraint in the spec 
> (https://arrow.apache.org/docs/format/CStreamInterface.html).
> The error I get when I pass an Int32Array to it (declared on the schema):
> {code:java}
> Invalid: Cannot import schema: ArrowSchema describes non-struct type int32
> {code}
> It would be nice to support everything, like the C data interface.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
