[ https://issues.apache.org/jira/browse/ARROW-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicola Crane updated ARROW-16575:
---------------------------------
Summary: [R] arrow::write_dataset() does nothing with 0 row dataframes in R
(was: arrow::write_dataset() does nothing with 0 row dataframes in R)
> [R] arrow::write_dataset() does nothing with 0 row dataframes in R
> ------------------------------------------------------------------
>
> Key: ARROW-16575
> URL: https://issues.apache.org/jira/browse/ARROW-16575
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Environment: Mac OS 12.3, R 4.1
> Reporter: Adam Black
> Priority: Minor
>
> In R, a dataframe can have 0 rows and still have column names and types.
>
> Expected behavior of arrow::write_dataset()
> It should be possible to have a FileSystemDataset with zero rows that still
> contains metadata about the column names and types. When given a dataframe
> with zero rows, arrow::write_dataset() should create that FileSystemDataset
> metadata.
>
> Actual behavior
> arrow::write_dataset() does nothing when passed a dataframe with zero rows:
> no directory or file is created, so a later arrow::open_dataset() call fails.
>
> Reproducible example using the current arrow package on CRAN
> {code:java}
> arrow::write_dataset(cars, here::here("cars"))
> arrow::open_dataset(here::here("cars"))
> #> FileSystemDataset with 1 Parquet file
> #> speed: double
> #> dist: double
> #>
> #> See $metadata for additional Schema metadata
> file.exists(here::here("cars"))
> #> [1] TRUE
> df <- cars[cars$speed > 1000, ]
> nrow(df)
> #> [1] 0
> arrow::write_dataset(df, here::here("df"), format = "feather")
> arrow::open_dataset(here::here("df"))
> #> Error: IOError: Cannot list directory
> #> '/private/var/folders/xx/01v98b6546ldnm1rg1_bvk000000gn/T/RtmpGkX0gK/reprex-17c305ed29ad5-nerdy-ram/df'.
> #> Detail: [errno 2] No such file or directory
> file.exists(here::here("df"))
> #> [1] FALSE
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)