Adam Black created ARROW-16575:
----------------------------------

             Summary: arrow::write_dataset() does nothing with 0 row dataframes 
in R
                 Key: ARROW-16575
                 URL: https://issues.apache.org/jira/browse/ARROW-16575
             Project: Apache Arrow
          Issue Type: Improvement
         Environment: Mac OS 12.3, R 4.1
            Reporter: Adam Black


In R, a data frame can have zero rows while still carrying its column names and types.
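
A quick base-R check of this, as a minimal sketch. The variable name `empty` is only illustrative and is not part of the reprex further down; `cars` is the built-in dataset used there.

```r
# Build a zero-row data frame by slicing the built-in `cars` dataset.
empty <- cars[0, ]

nrow(empty)            # 0 rows
names(empty)           # column names survive: "speed" "dist"
sapply(empty, class)   # column types survive: both "numeric"
```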

 

Expected behavior of arrow::write_dataset

I would expect it to be possible to have a FileSystemDataset with zero rows
that still records the column names and types. Given a data frame with zero
rows, arrow::write_dataset() should create the FileSystemDataset metadata
instead of writing nothing.

 

Actual behavior

arrow::write_dataset() does nothing when passed a data frame with zero rows:
the output directory is not created, so a later arrow::open_dataset() call on
that path fails.

 

Reproducible example using the current arrow package on CRAN
{code:java}
arrow::write_dataset(cars, here::here("cars"))
arrow::open_dataset(here::here("cars"))
#> FileSystemDataset with 1 Parquet file
#> speed: double
#> dist: double
#> 
#> See $metadata for additional Schema metadata
file.exists(here::here("cars"))
#> [1] TRUE

# Same steps with a zero-row data frame
df <- cars[cars$speed > 1000, ]
nrow(df)
#> [1] 0
arrow::write_dataset(df, here::here("df"), format = "feather")
arrow::open_dataset(here::here("df"))
#> Error: IOError: Cannot list directory '/private/var/folders/xx/01v98b6546ldnm1rg1_bvk000000gn/T/RtmpGkX0gK/reprex-17c305ed29ad5-nerdy-ram/df'. Detail: [errno 2] No such file or directory
file.exists(here::here("df"))
#> [1] FALSE
{code}
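
For comparison, a possible stopgap is to write a single file directly into the target directory rather than going through arrow::write_dataset(). This is only a sketch under assumptions that the report does not confirm: that arrow::write_parquet() writes a schema-only file for zero-row input, and that arrow::open_dataset() can then read the schema back from that directory. The directory `out_dir` and the file name `part-0.parquet` are hypothetical choices, not something the report or the arrow package prescribes.

```r
# Hypothetical output location; the reprex above uses here::here("df") instead.
out_dir <- tempfile("df_zero_rows_")
dir.create(out_dir)

# Zero-row data frame, built the same way as in the reprex.
df <- cars[cars$speed > 1000, ]

# Assumption: a single-file writer emits a schema-only Parquet file here.
arrow::write_parquet(df, file.path(out_dir, "part-0.parquet"))

# Open the directory as a dataset and inspect what was recovered.
ds <- arrow::open_dataset(out_dir)
names(ds)
nrow(ds)
```

If this holds on a given arrow version, it yields the zero-row dataset with column names and types described under the expected behavior, but it bypasses partitioning and the other write_dataset() options.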



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
