[
https://issues.apache.org/jira/browse/ARROW-12099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310781#comment-17310781
]
Malthe Borch commented on ARROW-12099:
--------------------------------------
Spark does not actually allow more than one explode (or "generate") expression
per select statement, but I suppose sometimes you want to "zip" the arrays (as
you have shown) and other times you want to form the cartesian product of
them, which in Spark takes a nested select:
{code:java}
spark-sql> SELECT a, explode(b) FROM (SELECT explode(sequence(0, 2)) a,
sequence(4, 6) b);
0 4
0 5
0 6
1 4
1 5
1 6
2 4
2 5
2 6
Time taken: 0.187 seconds, Fetched 9 row(s)
{code}
In your {{explode_table}} function, what role does the {{column}} parameter
play exactly? Why does it touch the {{'a'}} column if you only mention {{'b'}}?
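To make the two semantics concrete, here is a minimal pure-Python sketch. It models rows as plain dicts rather than using pyarrow, and the names {{explode_zip}} and {{explode_product}} are hypothetical, chosen only for illustration; neither is an existing Arrow or Spark function.

```python
from itertools import product


def explode_product(rows, columns):
    """Explode the listed columns independently: one output row per combination."""
    out = []
    for row in rows:
        # Cartesian product of the list values held in the selected columns.
        for combo in product(*(row[c] for c in columns)):
            new_row = dict(row)
            new_row.update(zip(columns, combo))
            out.append(new_row)
    return out


def explode_zip(rows, columns):
    """Explode the listed columns in lockstep: one output row per list position."""
    out = []
    for row in rows:
        # Zipping only makes sense when the lists in a row have equal lengths.
        lengths = {len(row[c]) for c in columns}
        if len(lengths) > 1:
            raise ValueError("zipped columns must have equal lengths per row")
        for combo in zip(*(row[c] for c in columns)):
            new_row = dict(row)
            new_row.update(zip(columns, combo))
            out.append(new_row)
    return out


# Same data as the Spark query: a = [0, 1, 2], b = [4, 5, 6].
rows = [{"a": [0, 1, 2], "b": [4, 5, 6]}]
zipped = explode_zip(rows, ["a", "b"])       # 3 rows, paired by position
crossed = explode_product(rows, ["a", "b"])  # 9 rows, every combination
```

With this input, {{crossed}} contains the same nine (a, b) pairs as the Spark output, while {{zipped}} contains only the three positionally paired rows.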
> [Python] Explode array column
> -----------------------------
>
> Key: ARROW-12099
> URL: https://issues.apache.org/jira/browse/ARROW-12099
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Malthe Borch
> Priority: Major
>
> In Apache Spark,
> [explode|https://spark.apache.org/docs/latest/api/sql/index.html#explode]
> separates the elements of an array column (or expression) into multiple rows.
> Note that each explode works at the top-level only (not recursively).
> This would also work with the existing
> [flatten|https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.flatten]
> method to allow fully unnesting a
> [pyarrow.StructArray|https://arrow.apache.org/docs/python/generated/pyarrow.StructArray.html#pyarrow-structarray].