[
https://issues.apache.org/jira/browse/ARROW-15747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551943#comment-17551943
]
Jorge Leitão commented on ARROW-15747:
--------------------------------------
Any array can be wrapped in a StructArray, so technically we just have to
remember to wrap it in the struct :)
I see three main reasons to support the simple array:
1. an implementation may not support StructArray over the C data interface but
may still support simpler types.
2. when we want to stream a "Table" where the chunks are not row-aligned, e.g.
c1: [10 rows][20 rows][20 rows]
c2: [20 rows][30 rows]
it is cleaner to have an iterator of chunks of those types, instead of wrapping
and unwrapping each of them in a StructArray.
3. StructArrays with sliced children over the C data interface are tricky. If
we wrap a sliced array in a struct and pass it over the C stream interface,
some implementations may not support it.
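To illustrate reasons 2 and 3 with the chunk layout above: streaming both
columns as struct chunks requires every chunk to cover the same rows in each
column, so the columns have to be re-sliced at the union of their chunk
boundaries. The helper below is a hypothetical pure-Python sketch (not part of
pyarrow) that only computes the resulting chunk lengths:

```python
from itertools import accumulate


def aligned_chunk_lengths(*columns):
    """Return the chunk lengths every column must be re-sliced to so that
    each chunk covers the same rows in all columns."""
    totals = {sum(lengths) for lengths in columns}
    if len(totals) > 1:
        raise ValueError("columns must have the same total number of rows")
    # Every chunk boundary of any column becomes a boundary of the result.
    boundaries = sorted(
        {end for lengths in columns for end in accumulate(lengths) if end > 0}
    )
    starts = [0] + boundaries[:-1]
    return [end - start for start, end in zip(starts, boundaries)]


c1 = [10, 20, 20]  # chunk lengths of column c1
c2 = [20, 30]      # chunk lengths of column c2
print(aligned_chunk_lengths(c1, c2))  # [10, 10, 10, 20]
```

Five original chunks become four struct chunks whose children are mostly
slices of the original arrays, which is exactly the sliced-children case from
reason 3.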
> [Python] Support C stream interface of single arrays
> ----------------------------------------------------
>
> Key: ARROW-15747
> URL: https://issues.apache.org/jira/browse/ARROW-15747
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Jorge Leitão
> Priority: Major
>
> It seems that the C stream interface in pyarrow currently requires the array
> to be a StructArray.
> I do not see this constraint in the spec
> (https://arrow.apache.org/docs/format/CStreamInterface.html).
> The error I get when I pass an Int32Array to it (declared on the schema):
> {code:java}
> Invalid: Cannot import schema: ArrowSchema describes non-struct type int32
> {code}
> It would be nice to support everything, like the C data interface.