[ https://issues.apache.org/jira/browse/ARROW-12099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312363#comment-17312363 ]
Joris Van den Bossche commented on ARROW-12099:
-----------------------------------------------
I _assume_ your example starts with a table like the following?
{code}
In [100]: table = pa.table({'a': [0, 1, 2], 'b': [[4, 5, 6]]*3})
In [101]: table.to_pandas()
Out[101]:
   a          b
0  0  [4, 5, 6]
1  1  [4, 5, 6]
2  2  [4, 5, 6]
{code}
The function I wrote above to explode a list column in such a table gives:
{code}
In [102]: explode_table(table, 'b').to_pandas()
Out[102]:
   a  b
0  0  4
1  0  5
2  0  6
3  1  4
4  1  5
5  1  6
6  2  4
7  2  5
8  2  6
{code}
which seems to be the same output as you showed above?
> [Python] Explode array column
> -----------------------------
>
> Key: ARROW-12099
> URL: https://issues.apache.org/jira/browse/ARROW-12099
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Malthe Borch
> Priority: Major
>
> In Apache Spark,
> [explode|https://spark.apache.org/docs/latest/api/sql/index.html#explode]
> separates the elements of an array column (or expression) into multiple rows.
> Note that each explode works at the top-level only (not recursively).
> This would also work with the existing
> [flatten|https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.flatten]
> method to allow fully unnesting a
> [pyarrow.StructArray|https://arrow.apache.org/docs/python/generated/pyarrow.StructArray.html#pyarrow-structarray].