[ https://issues.apache.org/jira/browse/ARROW-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoine Pitrou reassigned ARROW-8860:
-------------------------------------
Assignee: Joris Van den Bossche
> [C++] IPC/Feather decompression broken for nested arrays
> --------------------------------------------------------
>
> Key: ARROW-8860
> URL: https://issues.apache.org/jira/browse/ARROW-8860
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Joris Van den Bossche
> Assignee: Joris Van den Bossche
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.0.0
>
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> A table with a struct-typed column is read back with garbage values when
> it is written with compression (which is the default):
> {code:python}
> >>> import pyarrow as pa
> >>> from pyarrow import feather
> >>> table = pa.table({'col': pa.StructArray.from_arrays(
> ...     [[0, 1, 2], [1, 2, 3]], names=["f1", "f2"])})
> # roundtrip through feather
> >>> feather.write_feather(table, "test_struct.feather")
> >>> table2 = feather.read_table("test_struct.feather")
> >>> table2.column("col")
> <pyarrow.lib.ChunkedArray object at 0x7f0b0c4d7728>
> [
> -- is_valid: all not null
> -- child 0 type: int64
> [
> 24,
> 1261641627085906436,
> 1369095386551025664
> ]
> -- child 1 type: int64
> [
> 24,
> 1405756815161762308,
> 281479842103296
> ]
> ]
> {code}
> When not using compression, it is read back correctly:
> {code:python}
> >>> feather.write_feather(table, "test_struct.feather", compression="uncompressed")
> >>> table2 = feather.read_table("test_struct.feather")
> >>> table2.column("col")
> <pyarrow.lib.ChunkedArray object at 0x7f0b0e466778>
> [
> -- is_valid: all not null
> -- child 0 type: int64
> [
> 0,
> 1,
> 2
> ]
> -- child 1 type: int64
> [
> 1,
> 2,
> 3
> ]
> ]
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)