ghuls opened a new issue #286:
URL: https://github.com/apache/arrow-rs/issues/286


   **Describe the bug**
   
   The original bug report is here (against polars, which was using arrow-rs to parse Feather v2 (IPC) files):
   https://github.com/pola-rs/polars/issues/623
   
   Unable to load Feather v2 files created by pyarrow and pandas.
   
   Those files can be loaded fine by pyarrow and pandas itself.
   
   **To Reproduce**
   Steps to reproduce the behavior:
   
   Try to load the attached Feather files:
   
[test_feather_file.zip](https://github.com/apache/arrow-rs/files/6461057/test_feather_file.zip)
   
   ```
   test_pandas.feather: original Feather file
   test_arrow.feather: test_pandas.feather loaded with pyarrow (df_pa = pa.feather.read_feather('test_pandas.feather')) and saved with pyarrow
   test_polars.feather: test_pandas.feather loaded with pyarrow and saved with polars (this one can be read by arrow-rs)
   test_pandas_from_polars.feather: test_polars.feather loaded with polars and converted using the to_pandas option
   ```
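
A quick sanity check that is not part of the original report: Feather v2 is the Arrow IPC file format, which starts and ends with the magic bytes `ARROW1`, while legacy Feather v1 files use `FEA1`. The sketch below uses only the standard library to tell the two apart, so it can confirm that the attached files really are Feather v2 before blaming the reader. The helper name `sniff_feather_version` is hypothetical, and the check looks at the magic bytes only; it says nothing about compression or buffer alignment.

```python
from pathlib import Path

ARROW_MAGIC = b"ARROW1"      # Arrow IPC file format (Feather v2)
FEATHER_V1_MAGIC = b"FEA1"   # legacy Feather v1


def sniff_feather_version(path):
    """Guess the Feather version of a file from its leading/trailing magic bytes.

    Hypothetical helper for illustration; reads the whole file, which is
    fine for small test files like the ones attached here.
    """
    data = Path(path).read_bytes()
    if data.startswith(ARROW_MAGIC) and data.endswith(ARROW_MAGIC):
        return "v2"
    if data.startswith(FEATHER_V1_MAGIC) and data.endswith(FEATHER_V1_MAGIC):
        return "v1"
    return "unknown"


if __name__ == "__main__":
    # File names taken from the attached zip; missing files are skipped.
    for name in (
        "test_pandas.feather",
        "test_arrow.feather",
        "test_polars.feather",
        "test_pandas_from_polars.feather",
    ):
        if Path(name).exists():
            print(name, sniff_feather_version(name))
```

If all four files report `v2`, the difference between the readable and unreadable files has to lie deeper in the IPC payload rather than in the container format.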
   
   **Expected behavior**
   
   Feather v2 files can be opened by arrow-rs.
   
   **Additional context**
   
   ```python
   import polars as pl
   import pyarrow as pa
   import pyarrow.feather  # "import pyarrow" alone does not load pa.feather
   import pandas as pd
   
   # Reading the Feather file created with pandas works fine with pyarrow.
   # Note: read_feather returns a pandas DataFrame.
   df_pa = pa.feather.read_feather('test_pandas.feather')
   
   # Write that DataFrame to a Feather file (pandas delegates to pyarrow).
   df_pa.to_feather('test_arrow.feather')
   
   # Convert the DataFrame to a polars DataFrame.
   df_pl = pl.DataFrame(df_pa)
   
   # Convert the polars DataFrame back to a pandas DataFrame.
   df_pd = df_pl.to_pandas()
   
   # Write the pandas DataFrame to a Feather file.
   df_pd.to_feather('test_pandas_from_polars.feather')
   
   
   In [88]: df_pa
   Out[88]: 
      motif1  motif2  motif3  motif4 regions
   0     1.2     3.0     0.3     5.6    reg1
   1     6.7     3.0     4.3     5.6    reg2
   2     3.5     3.0     0.0     0.0    reg3
   3     0.0     3.0     0.0     5.6    reg4
   4     2.4     3.0     7.8     1.2    reg5
   5     2.4     3.0     0.6     0.0    reg6
   6     2.4     3.0     7.7     0.0    reg7
   
   In [89]: df_pl
   Out[89]: 
   shape: (7, 5)
   ╭────────┬────────┬────────┬────────┬─────────╮
   │ motif1 ┆ motif2 ┆ motif3 ┆ motif4 ┆ regions │
   │ ---    ┆ ---    ┆ ---    ┆ ---    ┆ ---     │
   │ f64    ┆ f64    ┆ f64    ┆ f64    ┆ str     │
   ╞════════╪════════╪════════╪════════╪═════════╡
   │ 1.2    ┆ 3      ┆ 0.3    ┆ 5.6    ┆ "reg1"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 6.7    ┆ 3      ┆ 4.3    ┆ 5.6    ┆ "reg2"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 3.5    ┆ 3      ┆ 0.0    ┆ 0.0    ┆ "reg3"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 0.0    ┆ 3      ┆ 0.0    ┆ 5.6    ┆ "reg4"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 2.4    ┆ 3      ┆ 7.8    ┆ 1.2    ┆ "reg5"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 2.4    ┆ 3      ┆ 0.6    ┆ 0.0    ┆ "reg6"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 2.4    ┆ 3      ┆ 7.7    ┆ 0.0    ┆ "reg7"  │
   ╰────────┴────────┴────────┴────────┴─────────╯
   
   In [90]: df_pd
   Out[90]: 
      motif1  motif2  motif3  motif4 regions
   0     1.2     3.0     0.3     5.6    reg1
   1     6.7     3.0     4.3     5.6    reg2
   2     3.5     3.0     0.0     0.0    reg3
   3     0.0     3.0     0.0     5.6    reg4
   4     2.4     3.0     7.8     1.2    reg5
   5     2.4     3.0     0.6     0.0    reg6
   6     2.4     3.0     7.7     0.0    reg7
   
   
   
   In [103]: pl.read_ipc('test_polars.feather')
   Out[103]: 
   shape: (7, 5)
   ╭────────┬────────┬────────┬────────┬─────────╮
   │ motif1 ┆ motif2 ┆ motif3 ┆ motif4 ┆ regions │
   │ ---    ┆ ---    ┆ ---    ┆ ---    ┆ ---     │
   │ f64    ┆ f64    ┆ f64    ┆ f64    ┆ str     │
   ╞════════╪════════╪════════╪════════╪═════════╡
   │ 1.2    ┆ 3      ┆ 0.3    ┆ 5.6    ┆ "reg1"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 6.7    ┆ 3      ┆ 4.3    ┆ 5.6    ┆ "reg2"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 3.5    ┆ 3      ┆ 0.0    ┆ 0.0    ┆ "reg3"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 0.0    ┆ 3      ┆ 0.0    ┆ 5.6    ┆ "reg4"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 2.4    ┆ 3      ┆ 7.8    ┆ 1.2    ┆ "reg5"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 2.4    ┆ 3      ┆ 0.6    ┆ 0.0    ┆ "reg6"  │
   ├╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
   │ 2.4    ┆ 3      ┆ 7.7    ┆ 0.0    ┆ "reg7"  │
   ╰────────┴────────┴────────┴────────┴─────────╯
   
   In [104]: pl.read_ipc('test_arrow.feather')
   thread '<unnamed>' panicked at 'assertion failed: prefix.is_empty() && suffix.is_empty()', /github/home/.cargo/git/checkouts/arrow-rs-3b86e19e889d5acc/d008f31/arrow/src/buffer/immutable.rs:179:9
   ---------------------------------------------------------------------------
   PanicException                            Traceback (most recent call last)
   <ipython-input-104-f9a22f9a0eb1> in <module>
   ----> 1 pl.read_ipc('test_arrow.feather')
   
   
   ~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/functions.py in read_ipc(file)
       278     """
       279     file = _prepare_file_arg(file)
   --> 280     return DataFrame.read_ipc(file)
       281 
       282 
   
   
   ~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/frame.py in read_ipc(file)
       235         """
       236         self = DataFrame.__new__(DataFrame)
   --> 237         self._df = PyDataFrame.read_ipc(file)
       238         return self
       239 
   
   PanicException: assertion failed: prefix.is_empty() && suffix.is_empty()
   
   In [105]: pl.read_ipc('test_pandas.feather')
   thread '<unnamed>' panicked at 'assertion failed: prefix.is_empty() && suffix.is_empty()', /github/home/.cargo/git/checkouts/arrow-rs-3b86e19e889d5acc/d008f31/arrow/src/buffer/immutable.rs:179:9
   ---------------------------------------------------------------------------
   PanicException                            Traceback (most recent call last)
   <ipython-input-105-35809d9ae65f> in <module>
   ----> 1 pl.read_ipc('test_pandas.feather')
   
   
   ~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/functions.py in read_ipc(file)
       278     """
       279     file = _prepare_file_arg(file)
   --> 280     return DataFrame.read_ipc(file)
       281 
       282 
   
   
   ~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/frame.py in read_ipc(file)
       235         """
       236         self = DataFrame.__new__(DataFrame)
   --> 237         self._df = PyDataFrame.read_ipc(file)
       238         return self
       239 
   
   PanicException: assertion failed: prefix.is_empty() && suffix.is_empty()
   
   In [106]: pl.read_ipc('test_pandas_from_polars.feather')
   thread '<unnamed>' panicked at 'assertion failed: prefix.is_empty() && suffix.is_empty()', /github/home/.cargo/git/checkouts/arrow-rs-3b86e19e889d5acc/d008f31/arrow/src/buffer/immutable.rs:179:9
   ---------------------------------------------------------------------------
   PanicException                            Traceback (most recent call last)
   <ipython-input-107-d0a17f51c6ac> in <module>
   ----> 1 pl.read_ipc('test_pandas_from_polars.feather')
   
   
   ~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/functions.py in read_ipc(file)
       278     """
       279     file = _prepare_file_arg(file)
   --> 280     return DataFrame.read_ipc(file)
       281 
       282 
   
   
   ~/software/anaconda3/envs/create_cistarget_databases/lib/python3.8/site-packages/polars/frame.py in read_ipc(file)
       235         """
       236         self = DataFrame.__new__(DataFrame)
   --> 237         self._df = PyDataFrame.read_ipc(file)
       238         return self
       239 
   
   PanicException: assertion failed: prefix.is_empty() && suffix.is_empty()
   
   ```

