sergun commented on issue #38643:
URL: https://github.com/apache/arrow/issues/38643#issuecomment-1808896791
> Exploding is currently something that isn't provided out of the box, see
> #27923 for an issue on this topic and some example workarounds (using existing
> pyarrow compute functions to achieve the same effect).
>
> Once you have exploded the list over multiple rows, you can flatten the table
> with the struct type into a table with a top-level column for each struct field
> with the `flatten()` method:
>
> ```
> >>> import pandas as pd
> >>> import pyarrow as pa
> >>> table = pa.table({"id": [1, 1, 2], "events": [{"tm": pd.Timestamp("2012-01-01"), "sum": 10}] * 3})
> >>> table.to_pandas()
>    id                                  events
> 0   1  {'sum': 10, 'tm': 2012-01-01 00:00:00}
> 1   1  {'sum': 10, 'tm': 2012-01-01 00:00:00}
> 2   2  {'sum': 10, 'tm': 2012-01-01 00:00:00}
>
> >>> table.flatten().to_pandas()
>    id  events.sum   events.tm
> 0   1          10  2012-01-01
> 1   1          10  2012-01-01
> 2   2          10  2012-01-01
> ```
Thanks a lot!
The workarounds from #27923 combined with `pa.Table.flatten()` solve the issue.