Nic Crane created ARROW-13278:
---------------------------------

             Summary: [R] open_dataset autodetects types wrong in fairly unambiguous data
                 Key: ARROW-13278
                 URL: https://issues.apache.org/jira/browse/ARROW-13278
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
            Reporter: Nic Crane
            Assignee: Nic Crane


 
{code:java}
library(arrow)

# Write some partitioned data to disk to read back in
write_dataset(airquality, "airquality_partitioned", partitioning = c("Month", "Day"))

# Read data from folder
air_data <- open_dataset("airquality_partitioned", partitioning = c("Month", "Day"))

# Print the dataset to inspect the detected schema
> air_data
FileSystemDataset with 153 Parquet files
Ozone: int32
Solar.R: int32
Wind: double
Temp: int32
Month: string
Day: string
{code}
Month and Day are integers in the original data, and neither column contains 
NA values. The docs for open_dataset say that partitioning can be supplied as 
"a character vector that defines the field names corresponding to those path 
segments (that is, you're providing the names that would correspond to a Schema 
but the types will be autodetected)". Given that, I would expect both columns 
to be detected as integers rather than strings, so this looks like it might be 
a bug somewhere.
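
One way to narrow this down might be to look at the directory names that 
write_dataset produced, since per the docs quoted above those path segments 
are what the partition types are detected from. This is a minimal sketch, 
assuming the dataset is written to "airquality_partitioned" as in the example 
above. The listing is not part of the original output in this report, so no 
expected result is shown.

{code:java}
library(arrow)

# Re-create the partitioned dataset so the sketch is self-contained
write_dataset(airquality, "airquality_partitioned", partitioning = c("Month", "Day"))

# List the first few files with their relative paths; the directory
# components are the path segments that partitioning is applied to
head(list.files("airquality_partitioned", recursive = TRUE))
{code}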

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
