[
https://issues.apache.org/jira/browse/ARROW-12099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310753#comment-17310753
]
Malthe Borch commented on ARROW-12099:
--------------------------------------
[~jorisvandenbossche] in Spark, explode does not actually "zip" arrays in
different columns; it simply copies the entire row once for each value in the
exploded column (which is originally an array), so that if the array had N
values, there are now N rows in place of the original row. The same is done
for every row in the original dataframe.
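
To make the row-copy behaviour described above concrete, here is a minimal
sketch in plain Python. It is neither Spark nor Arrow code: the table is
modelled as a list of dicts, and the helper name explode_rows and the sample
data are hypothetical. It also assumes that rows whose array is null or empty
produce no output rows, which is how I understand Spark's explode to behave.

```python
def explode_rows(rows, column):
    """Replace each row by one copy per element of row[column].

    Hypothetical helper for illustration only; not an Arrow or Spark API.
    """
    exploded = []
    for row in rows:
        values = row[column]
        # Assumption: a null (None) or empty array yields no output rows.
        if not values:
            continue
        for value in values:
            new_row = dict(row)      # copy every other column unchanged
            new_row[column] = value  # the array is replaced by one element
            exploded.append(new_row)
    return exploded


rows = [
    {"id": 1, "a": [10, 20, 30], "b": ["x", "y"]},
    {"id": 2, "a": [40], "b": ["z"]},
]

result = explode_rows(rows, "a")
for r in result:
    print(r)
```

Note that column "b" is never zipped with "a": its array is carried over whole
into every copy of the row, and would need its own explode call.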
> [Python] Explode array column
> -----------------------------
>
> Key: ARROW-12099
> URL: https://issues.apache.org/jira/browse/ARROW-12099
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Malthe Borch
> Priority: Major
>
> In Apache Spark,
> [explode|https://spark.apache.org/docs/latest/api/sql/index.html#explode]
> separates the elements of an array column (or expression) into multiple rows.
> Note that each explode works at the top-level only (not recursively).
> This would also work with the existing
> [flatten|https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.flatten]
> method to allow fully unnesting a
> [pyarrow.StructArray|https://arrow.apache.org/docs/python/generated/pyarrow.StructArray.html#pyarrow-structarray].
--
This message was sent by Atlassian Jira
(v8.3.4#803005)