ghuls opened a new issue #286:
URL: https://github.com/apache/arrow-rs/issues/286
**Describe the bug**
Original bug report is here (against polars, which was using arrow-rs for parsing Feather v2 files (IPC)): https://github.com/pola-rs/polars/issues/623

Unable to load Feather v2 files created by pyarrow and pandas. Those files can be loaded fine by pyarrow and pandas themselves.

**To Reproduce**
Steps to reproduce the behavior:

Try to load the attached Feather files: [test_feather_file.zip](https://github.com/apache/arrow-rs/files/6461057/test_feather_file.zip)

```
test_pandas.feather: Original Feather file
test_arrow.feather: loading test_pandas.feather with pyarrow and saving with pyarrow: df_pa = pa.feather.read_feather('test_pandas.feather')
test_polars.feather: Loading test_pandas.feather with pyarrow and saving with polars (this one can be read by arrow-rs)
test_pandas_from_polars.feather: Loading test_polars.feather with polars and using the to_pandas option.
```

**Expected behavior**
Feather v2 files can be opened by arrow-rs.

**Additional context**

```python
import polars as pl
import pyarrow as pa
import pandas as pd

# Reading Feather file created with Pandas with pyarrow works fine.
df_pa = pa.feather.read_feather('test_pandas.feather')

# Write pyarrow dataframe to Feather file.
df_pa.to_feather('test_arrow.feather')

# Convert pyarrow dataframe to polars dataframe.
df_pl = pl.DataFrame(df_pa)

# Convert polars dataframe to pandas dataframe.
df_pd = df_pl.to_pandas()

# Write pandas dataframe to feather file.
df_pd.to_feather('test_pandas_from_polars.feather')

In [88]: df_pa
Out[88]:
   motif1  motif2  motif3  motif4  regions
0     1.2     3.0     0.3     5.6     reg1
1     6.7     3.0     4.3     5.6     reg2
2     3.5     3.0     0.0     0.0     reg3
3     0.0     3.0     0.0     5.6     reg4
4     2.4     3.0     7.8     1.2     reg5
5     2.4     3.0     0.6     0.0     reg6
6     2.4     3.0     7.7     0.0     reg7

In [89]: df_pl
Out[89]:
shape: (7, 5)
╭────────┬────────┬────────┬────────┬─────────╮
│ motif1 ┆ motif2 ┆ motif3 ┆ motif4 ┆ regions │
│ ---    ┆ ---    ┆ ---    ┆ ---    ┆ ---     │
│ f64    ┆ f64    ┆ f64    ┆ f64    ┆ str     │
╞════════╪════════╪════════╪════════╪═════════╡
│ 1.2    ┆ 3      ┆ 0.3    ┆ 5.6    ┆ "reg1"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 6.7    ┆ 3      ┆ 4.3    ┆ 5.6    ┆ "reg2"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 3.5    ┆ 3      ┆ 0.0    ┆ 0.0    ┆ "reg3"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 0.0    ┆ 3      ┆ 0.0    ┆ 5.6    ┆ "reg4"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2.4    ┆ 3      ┆ 7.8    ┆ 1.2    ┆ "reg5"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2.4    ┆ 3      ┆ 0.6    ┆ 0.0    ┆ "reg6"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2.4    ┆ 3      ┆ 7.7    ┆ 0.0    ┆ "reg7"  │
╰────────┴────────┴────────┴────────┴─────────╯

In [90]: df_pd
Out[90]:
   motif1  motif2  motif3  motif4  regions
0     1.2     3.0     0.3     5.6     reg1
1     6.7     3.0     4.3     5.6     reg2
2     3.5     3.0     0.0     0.0     reg3
3     0.0     3.0     0.0     5.6     reg4
4     2.4     3.0     7.8     1.2     reg5
5     2.4     3.0     0.6     0.0     reg6
6     2.4     3.0     7.7     0.0     reg7

In [103]: pl.read_ipc('test_polars.feather')
Out[103]:
shape: (7, 5)
╭────────┬────────┬────────┬────────┬─────────╮
│ motif1 ┆ motif2 ┆ motif3 ┆ motif4 ┆ regions │
│ ---    ┆ ---    ┆ ---    ┆ ---    ┆ ---     │
│ f64    ┆ f64    ┆ f64    ┆ f64    ┆ str     │
╞════════╪════════╪════════╪════════╪═════════╡
│ 1.2    ┆ 3      ┆ 0.3    ┆ 5.6    ┆ "reg1"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 6.7    ┆ 3      ┆ 4.3    ┆ 5.6    ┆ "reg2"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 3.5    ┆ 3      ┆ 0.0    ┆ 0.0    ┆ "reg3"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 0.0    ┆ 3      ┆ 0.0    ┆ 5.6    ┆ "reg4"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2.4    ┆ 3      ┆ 7.8    ┆ 1.2    ┆ "reg5"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2.4    ┆ 3      ┆ 0.6    ┆ 0.0    ┆ "reg6"  │
├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2.4    ┆ 3      ┆ 7.7    ┆ 0.0    ┆ "reg7"  │
╰────────┴────────┴────────┴────────┴─────────╯

In [104]: pl.read_ipc('test_arrow.feather')
thread '<unnamed>' panicked at 'assertion failed: prefix.is_empty() && suffix.is_empty()', /github/home/.cargo/git/checkouts/arrow-rs-3b86e19e889d5acc/d008f31/arrow/src/buffer/immutable.rs:179:9
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
<ipython-input-104-f9a22f9a0eb1> in <module>
----> 1 pl.read_ipc('test_arrow.feather')

~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/functions.py in read_ipc(file)
    278     """
    279     file = _prepare_file_arg(file)
--> 280     return DataFrame.read_ipc(file)
    281
    282

~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/frame.py in read_ipc(file)
    235         """
    236         self = DataFrame.__new__(DataFrame)
--> 237         self._df = PyDataFrame.read_ipc(file)
    238         return self
    239

PanicException: assertion failed: prefix.is_empty() && suffix.is_empty()

In [105]: pl.read_ipc('test_pandas.feather')
thread '<unnamed>' panicked at 'assertion failed: prefix.is_empty() && suffix.is_empty()', /github/home/.cargo/git/checkouts/arrow-rs-3b86e19e889d5acc/d008f31/arrow/src/buffer/immutable.rs:179:9
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
<ipython-input-105-35809d9ae65f> in <module>
----> 1 pl.read_ipc('test_pandas.feather')

~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/functions.py in read_ipc(file)
    278     """
    279     file = _prepare_file_arg(file)
--> 280     return DataFrame.read_ipc(file)
    281
    282

~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/frame.py in read_ipc(file)
    235         """
    236         self = DataFrame.__new__(DataFrame)
--> 237         self._df = PyDataFrame.read_ipc(file)
    238         return self
    239

PanicException: assertion failed: prefix.is_empty() && suffix.is_empty()

In [106]: pl.read_ipc('test_pandas_from_polars.feather')
thread '<unnamed>' panicked at 'assertion failed: prefix.is_empty() && suffix.is_empty()', /github/home/.cargo/git/checkouts/arrow-rs-3b86e19e889d5acc/d008f31/arrow/src/buffer/immutable.rs:179:9
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
<ipython-input-107-d0a17f51c6ac> in <module>
----> 1 pl.read_ipc('test_pandas_from_polars.feather')

~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/functions.py in read_ipc(file)
    278     """
    279     file = _prepare_file_arg(file)
--> 280     return DataFrame.read_ipc(file)
    281
    282

~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/frame.py in read_ipc(file)
    235         """
    236         self = DataFrame.__new__(DataFrame)
--> 237         self._df = PyDataFrame.read_ipc(file)
    238         return self
    239

PanicException: assertion failed: prefix.is_empty() && suffix.is_empty()
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
