dgreiss commented on code in PR #36436:
URL: https://github.com/apache/arrow/pull/36436#discussion_r1291320492
##########
r/R/csv.R:
##########
@@ -510,32 +510,55 @@ CsvReadOptions$create <- function(use_threads =
option_use_threads(),
options
}
-readr_to_csv_write_options <- function(include_header = TRUE,
+readr_to_csv_write_options <- function(col_names = TRUE,
batch_size = 1024L,
- na = "") {
+ delim = ",",
+ na = "",
+ eol = "\n",
+ quote = "Needed") {
Review Comment:
I went ahead and added the readr options and updated the docs and tests
to make sure we are testing the output. Notably, [Arrow quotes all strings and
binaries](https://github.com/apache/arrow/blob/a7d5217937ca8b8be77995910f672c5e61e9d699/cpp/src/arrow/csv/writer.h#L31-L39),
not just in cases where there is a quote or delimiter in the string:
```c++
// Functionality for converting Arrow data to Comma separated value text.
// This library supports all primitive types that can be cast to a StringArrays.
// It applies to following formatting rules:
// - For non-binary types no quotes surround values. Nulls are represented as the empty
// string.
// - For binary types all non-null data is quoted (and quotes within data are escaped
// with an additional quote).
// Null values are empty and unquoted.
```
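To make the difference concrete, here is a minimal sketch in plain Python of the two quoting rules. It is neither Arrow nor readr code: the function names `format_always_quoted` and `format_quoted_if_needed` are hypothetical, and the "needed" rule is simplified to checking for the delimiter, a quote, or a line break, which may not cover every case readr handles.

```python
from typing import Optional


def format_always_quoted(value: Optional[str]) -> str:
    """Rule described in the writer.h comment: quote every non-null string,
    doubling any embedded quotes. Nulls are empty and unquoted."""
    if value is None:
        return ""
    return '"' + value.replace('"', '""') + '"'


def format_quoted_if_needed(value: Optional[str], delim: str = ",") -> str:
    """Simplified 'needed' rule (an assumption, not readr's exact logic):
    quote only when the value contains the delimiter, a quote, or a line break."""
    if value is None:
        return ""
    if any(ch in value for ch in (delim, '"', "\n", "\r")):
        return '"' + value.replace('"', '""') + '"'
    return value


if __name__ == "__main__":
    for field in ["abc", "a,b", 'say "hi"', None]:
        print(repr(field), format_always_quoted(field), format_quoted_if_needed(field))
```

Under these assumptions, a plain value such as `abc` is written as `"abc"` by the first rule and as `abc` by the second, which is the kind of difference the updated tests check for.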
I've learned more than I expected about the Arrow CSV writer through this PR
😅. @thisisnic @paleolimbot let me know if there's anything else on this one.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]