jorisvandenbossche commented on issue #38643: URL: https://github.com/apache/arrow/issues/38643#issuecomment-1803767114
Exploding is currently not provided out of the box. See https://github.com/apache/arrow/issues/27923 for an issue on this topic and some example workarounds (using existing pyarrow compute functions to achieve the same effect).

Once you have exploded the list over multiple rows, you can use the `flatten()` method to turn the table with the struct column into a table with a top-level column for each struct field:

```
>>> import pandas as pd
>>> import pyarrow as pa
>>> table = pa.table({"id": [1, 1, 2], "events": [{"tm": pd.Timestamp("2012-01-01"), "sum": 10}] * 3})
>>> table.to_pandas()
   id                                   events
0   1  {'sum': 10, 'tm': 2012-01-01 00:00:00}
1   1  {'sum': 10, 'tm': 2012-01-01 00:00:00}
2   2  {'sum': 10, 'tm': 2012-01-01 00:00:00}
>>> table.flatten().to_pandas()
   id  events.sum  events.tm
0   1          10 2012-01-01
1   1          10 2012-01-01
2   2          10 2012-01-01
```
