jonkeane commented on a change in pull request #11668:
URL: https://github.com/apache/arrow/pull/11668#discussion_r758638024
##########
File path: r/R/csv.R
##########
@@ -415,6 +415,16 @@ CsvReadOptions$create <- function(use_threads = option_use_threads(),
)
}
+readr_to_csv_write_options <- function(include_header,
+ batch_size = 1024L) {
+ assert_that(is_integerish(batch_size, n = 1, finite = TRUE), batch_size > 0)
Review comment:
This is more a curiosity than anything else: what _would_ happen if I
gave an infinite batch size? I could imagine someone doing that if they
absolutely needed everything to be written in one batch but somehow couldn't
know the number of rows beforehand.
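For what it's worth, my reading of the assertion as written is that `finite = TRUE` would reject `Inf` before it ever reaches the writer. Here is a rough base-R sketch of that behaviour. `check_batch_size()` is a hypothetical stand-in I made up for illustration, not a function in the package, and it only approximates what `is_integerish()` does:

```r
# Hypothetical stand-in for the assertion in the diff (not part of arrow).
# It approximates is_integerish(batch_size, n = 1, finite = TRUE) plus
# the batch_size > 0 check using base R only.
check_batch_size <- function(batch_size) {
  stopifnot(
    is.numeric(batch_size),
    length(batch_size) == 1,
    is.finite(batch_size),            # this is the check that rejects Inf
    batch_size == trunc(batch_size),  # whole numbers only
    batch_size > 0
  )
  invisible(TRUE)
}

# A regular batch size passes the check.
ok <- check_batch_size(1024L)

# An infinite batch size fails the check instead of being passed along.
inf_rejected <- inherits(try(check_batch_size(Inf), silent = TRUE), "try-error")
```

If that reading is right, a caller who wants a single batch would have to pass the row count explicitly rather than `Inf`.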
##########
File path: r/R/csv.R
##########
@@ -623,9 +638,57 @@ readr_to_csv_convert_options <- function(na,
#' @include arrow-package.R
write_csv_arrow <- function(x,
sink,
+ file = NULL,
include_header = TRUE,
- batch_size = 1024L) {
- write_options <- CsvWriteOptions$create(include_header, batch_size)
+ col_names = NULL,
+ batch_size = 1024L,
+ write_options = NULL,
+ ...) {
+ unsupported_passed_args <- names(list(...))
+
+ if (length(unsupported_passed_args)) {
+ stop(
+ "The following ",
+ ngettext(length(unsupported_passed_args), "argument is ", "arguments are "),
+ "not yet supported in Arrow: ",
+ oxford_paste(unsupported_passed_args),
+ call. = FALSE
+ )
+ }
+
+ if (!missing(file) && !missing(sink)) {
+ stop(
+ "You have supplied both \"file\" and \"sink\" arguments. Please ",
+ "supply only one of them.",
+ call. = FALSE
+ )
+ }
+
+ if (missing(sink) && !missing(file)) {
+ sink <- file
+ }
+
+ if (!missing(col_names) && !missing(include_header)) {
+ stop(
+ "You have supplied both \"col_names\" and \"include_header\" ",
+ "arguments. Please supply only one of them.",
+ call. = FALSE
+ )
+ }
+
+ # default value are considered missing by base R
+ if (missing(include_header) && !missing(col_names)) {
+ message(
+ "You have supplied a value for \"col_names\". This will overwrite ",
+ "the value for the \"include_headers\" argument.")
+ include_header <- col_names
+ }
Review comment:
Am I reading this right that you get here if you specify
`write_csv_arrow(df, "file.csv", col_names = TRUE)`?
If so, I don't think we need a message there about replacing
`include_header` since it's not being specified / there's no conflict.
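Roughly what I have in mind, as a sketch only: keep the error for the real conflict, and let `col_names` quietly stand in for `include_header` when it is the only one supplied. `resolve_include_header()` is a hypothetical helper name used for illustration, not something in the PR:

```r
# Hypothetical helper (not in the PR) showing the resolution without a message.
resolve_include_header <- function(include_header = TRUE, col_names = NULL) {
  # Both supplied explicitly: this is the only real conflict, so error.
  if (!missing(col_names) && !missing(include_header)) {
    stop(
      "You have supplied both \"col_names\" and \"include_header\" ",
      "arguments. Please supply only one of them.",
      call. = FALSE
    )
  }

  # Only col_names supplied: use it silently, since nothing is overridden.
  if (!missing(col_names)) {
    include_header <- col_names
  }

  include_header
}

default_value   <- resolve_include_header()                        # neither supplied
from_col_names  <- resolve_include_header(col_names = FALSE)       # readr-style argument
from_header_arg <- resolve_include_header(include_header = FALSE)  # arrow-style argument
```

With this shape, `write_csv_arrow(df, "file.csv", col_names = TRUE)` would not print anything, and supplying both arguments would still stop with the existing error.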