[ 
https://issues.apache.org/jira/browse/ARROW-12099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17310781#comment-17310781
 ] 

Malthe Borch commented on ARROW-12099:
--------------------------------------

 

Spark cannot actually explode (or "generate") more than one expression per 
select statement (that is simply not allowed), but I suppose sometimes you want 
to "zip" the arrays (as you have shown) and other times you want to 
form the Cartesian product of them:
{code:java}
spark-sql> SELECT a, explode(b) FROM (SELECT explode(sequence(0, 2)) a, 
sequence(4, 6) b);
0       4
0       5
0       6
1       4
1       5
1       6
2       4
2       5
2       6
Time taken: 0.187 seconds, Fetched 9 row(s)
{code}
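To make the distinction concrete, here is a minimal sketch of the two behaviours using plain Python rows rather than Arrow arrays. The names {{explode_zip}} and {{explode_product}} are hypothetical and are not part of pyarrow or Spark; they only illustrate the semantics under the assumption that each row is a dict holding list-valued columns:
{code:python}
from itertools import product


def explode_zip(rows, columns):
    """Explode several list columns in lockstep: element i pairs with element i.

    Hypothetical helper for illustration only (not a pyarrow function).
    """
    out = []
    for row in rows:
        lengths = {len(row[c]) for c in columns}
        if len(lengths) > 1:
            raise ValueError("zipped columns must have equal lengths within a row")
        for values in zip(*(row[c] for c in columns)):
            # Copy the row, replacing each list column with a single element.
            out.append({**row, **dict(zip(columns, values))})
    return out


def explode_product(rows, columns):
    """Explode several list columns into their Cartesian product per row.

    Hypothetical helper for illustration only (not a pyarrow function).
    """
    out = []
    for row in rows:
        for values in product(*(row[c] for c in columns)):
            out.append({**row, **dict(zip(columns, values))})
    return out


rows = [{"a": [0, 1, 2], "b": [4, 5, 6]}]
zipped = explode_zip(rows, ["a", "b"])       # 3 rows: (0, 4), (1, 5), (2, 6)
crossed = explode_product(rows, ["a", "b"])  # 9 rows, same pairs as the Spark query above
{code}
The zipped variant needs the lists in a row to have equal lengths, whereas the product variant does not; that difference is one reason the two probably should not share a single default.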
In your {{explode_table}} function, what role does the {{column}} parameter 
have exactly? Why does it touch the {{'a'}} column if you mention {{'b'}}?

 

 

> [Python] Explode array column
> -----------------------------
>
>                 Key: ARROW-12099
>                 URL: https://issues.apache.org/jira/browse/ARROW-12099
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Malthe Borch
>            Priority: Major
>
> In Apache Spark, 
> [explode|https://spark.apache.org/docs/latest/api/sql/index.html#explode] 
> separates the elements of an array column (or expression) into multiple rows.
> Note that each explode works at the top-level only (not recursively).
> This would also work with the existing 
> [flatten|https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.flatten]
>  method to allow fully unnesting a 
> [pyarrow.StructArray|https://arrow.apache.org/docs/python/generated/pyarrow.StructArray.html#pyarrow-structarray].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
