[
https://issues.apache.org/jira/browse/ARROW-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430772#comment-17430772
]
Weston Pace commented on ARROW-14344:
-------------------------------------
I tried to reproduce this with the latest pyarrow and the latest arrow R
package, but was unable to do so. My Python script was:
{code:python}
import pandas as pd
import pyarrow as pa
import pyarrow.ipc as ipc
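# Empty DataFrame: no columns, no rows
df = pd.DataFrame({})
table = pa.Table.from_pandas(df.reset_index(drop=True))
# Write the empty table in the Arrow IPC file (Feather V2) format
with ipc.RecordBatchFileWriter('empty.arrow', schema=table.schema) as writer:
    writer.write_table(table)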
{code}
Then for R I had
{code:r}
> library(arrow)
See arrow_info() for available features

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> arrow::read_feather("empty.arrow")
data frame with 0 columns and 0 rows
{code}
I also tried with pyarrow 3.0.0 and got the same result: both pyarrow versions
created identical empty files, apart from the python/pandas version recorded in
the metadata.
> [R][Python] Crash when reading empty .feather file
> --------------------------------------------------
>
> Key: ARROW-14344
> URL: https://issues.apache.org/jira/browse/ARROW-14344
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python, R
> Environment: Ubuntu Server 20.04.3, arrow (R) 5.0.02, pyarrow 3.0.0
> (Python), RStudio 1.4.1717, R 4.1.0
> Reporter: Reinier van Linschoten
> Priority: Major
> Labels: R, arrow, bug, error, pandas, python
>
> I get an R Session Error in RStudio Server when I try to read an empty
> .feather file.
> Error: The previous R session was abnormally terminated due to an unexpected
> crash. You may have lost workspace data as a result of this crash.
> Reproduce:
> * Create empty pandas dataframe in Python
> * Write to .feather file with .reset_index(drop=True) and
> compression="uncompressed"
> * Try to read data in RStudio with arrow::read_feather(path)
> * Error
> I can read dataframes with one or more rows in RStudio.
> I can read the empty dataframe with pandas.read_feather(). This returns an
> empty pandas dataframe.