[ https://issues.apache.org/jira/browse/ARROW-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461583#comment-17461583 ]
Dewey Dunnington commented on ARROW-13531:
------------------------------------------
Reprex:
{code:R}
library(arrow, warn.conflicts = FALSE)
tf <- tempfile()
write("col1;col2\n1,23;val1\n4,56;val2\n", tf)
# how it's done elsewhere
read.csv2(tf)
#>   col1 col2
#> 1 1.23 val1
#> 2 4.56 val2
readr::read_csv2(tf, show_col_types = FALSE)
#> ℹ Using "','" as decimal and "'.'" as grouping mark. Use `read_delim()` for more control.
#> # A tibble: 2 × 2
#>    col1 col2
#>   <dbl> <chr>
#> 1  1.23 val1
#> 2  4.56 val2
readr::read_delim(
  tf,
  delim = ";",
  locale = readr::locale(decimal_mark = ","),
  show_col_types = FALSE
)
#> # A tibble: 2 × 2
#>    col1 col2
#>   <dbl> <chr>
#> 1  1.23 val1
#> 2  4.56 val2
# possible syntax in arrow::read_csv_arrow()
read_csv_arrow(
  tf,
  parse_options = CsvParseOptions$create(delimiter = ";"),
  convert_options = CsvConvertOptions$create(decimal_point = ",")
)
#> Error in CsvConvertOptions$create(decimal_point = ","): unused argument (decimal_point = ",")
read_csv2_arrow(tf)
#> Error in read_csv2_arrow(tf): could not find function "read_csv2_arrow"
{code}
Where the CsvConvertOptions are defined:
https://github.com/apache/arrow/blob/670af338bc740888bffea65b28ee2bcc065b555a/r/R/csv.R#L526-L559
https://github.com/apache/arrow/blob/670af338bc740888bffea65b28ee2bcc065b555a/r/src/csv.cpp#L79-L149
https://github.com/apache/arrow/blob/670af338bc740888bffea65b28ee2bcc065b555a/cpp/src/arrow/csv/options.h#L107-L108
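Until the R bindings expose the decimal point option, one possible interim workaround is to read the affected columns as strings and convert them afterwards. The sketch below uses base R only so that it runs without arrow; the helper name {{parse_decimal_comma()}} is hypothetical and not part of arrow, readr, or base R. It assumes values have no grouping mark, so only the first comma in each value is replaced.

```r
# Hypothetical helper (not part of arrow): convert strings that use ","
# as the decimal mark into doubles. Assumes there is no grouping mark.
parse_decimal_comma <- function(x) {
  as.numeric(sub(",", ".", x, fixed = TRUE))
}

# Same data as in the reprex above
tf <- tempfile()
writeLines(c("col1;col2", "1,23;val1", "4,56;val2"), tf)

# Read every column as character so that "1,23" is not misparsed
raw <- read.delim(tf, sep = ";", colClasses = "character")

# Convert only the column known to hold decimal-comma numbers
raw$col1 <- parse_decimal_comma(raw$col1)
str(raw)
```

The same conversion could presumably be applied to a column that arrow has read as a string, but that is untested here and would be unnecessary once {{decimal_point}} is available in {{CsvConvertOptions}}.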
> [R] Read CSV with comma as decimal mark
> ---------------------------------------
>
> Key: ARROW-13531
> URL: https://issues.apache.org/jira/browse/ARROW-13531
> Project: Apache Arrow
> Issue Type: New Feature
> Components: R
> Reporter: Neal Richardson
> Priority: Major
> Fix For: 7.0.0
>
>
> Followup to ARROW-13421. There is a new ConvertOption, that part is easy.
> There may be some subtleties in emulating the readr way of supporting this
> since it uses a broader {{locale()}} object, but maybe we just add
> {{read_csv2_arrow}} (matching {{readr::read_csv2}} and {{base::read.csv2}})
> and that's enough.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)