Keith Hughitt created ARROW-7825:
------------------------------------
Summary: Have arrow::read_parquet respect options(stringsAsFactors = FALSE)
Key: ARROW-7825
URL: https://issues.apache.org/jira/browse/ARROW-7825
Project: Apache Arrow
Issue Type: Improvement
Components: R
Affects Versions: 0.16.0
Environment: Linux 64-bit 5.4.15
Reporter: Keith Hughitt
This is the same issue as reported for feather::read_feather
(https://issues.apache.org/jira/browse/ARROW-7823).
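For the R arrow package, the "read_parquet()" function currently does not
respect "options(stringsAsFactors = FALSE)". In the example below, the Species
column comes back as a factor, whereas read.delim() and readr::read_tsv() both
return it as a character vector, so the same data ends up with different column
types depending on the reader.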
*Example:*
{code:java}
library(arrow)
library(readr)

options(stringsAsFactors = FALSE)

# Write the same six rows as TSV and as Parquet
write_tsv(head(iris), 'test.tsv')
write_parquet(head(iris), 'test.parquet')

# Base R reader: Species is returned as character
head(read.delim('test.tsv', sep='\t')$Species)
# [1] "setosa" "setosa" "setosa" "setosa" "setosa" "setosa"

# readr: Species is returned as character
head(read_tsv('test.tsv', col_types = cols())$Species)
# [1] "setosa" "setosa" "setosa" "setosa" "setosa" "setosa"

# arrow: Species is returned as a factor
head(read_parquet('test.parquet')$Species)
# [1] setosa setosa setosa setosa setosa setosa
# Levels: setosa versicolor virginica
{code}
*Versions:*
- R 3.6.2
- arrow_0.15.1.9000
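*Possible workaround:*
Until this is addressed, factor columns can be converted back to character after reading. The following is only a sketch in base R, not a proposed fix; "factors_to_character" is a hypothetical helper name and not part of arrow.
{code:java}
# Hypothetical helper (not part of arrow): convert every factor column
# of a data frame to character and leave all other columns unchanged.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(x) if (is.factor(x)) as.character(x) else x)
  df
}

# Intended use with the file from the example above:
# df <- factors_to_character(read_parquet('test.parquet'))

# Self-contained check on an in-memory data frame:
df <- factors_to_character(head(iris))
class(df$Species)
# [1] "character"
{code}
Assigning into "df[]" keeps the class and attributes of the input object, so this should work for both plain data frames and tibbles.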