This is an automated email from the ASF dual-hosted git repository.
thisisnic pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 41c481f41a GH-44069: [Docs][R] Add note to to_arrow() docs about collect/compute (#44094)
41c481f41a is described below
commit 41c481f41ad322341d0698d001b4af5c98c5dbac
Author: Bryce Mecum <[email protected]>
AuthorDate: Sat Sep 14 12:38:47 2024 -0700
GH-44069: [Docs][R] Add note to to_arrow() docs about collect/compute (#44094)
### Rationale for this change
Improves the documentation for the `to_arrow()` function for the use case
referenced in https://github.com/apache/arrow/issues/44069.
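
The workaround described in the new note can be sketched roughly as follows. This is an illustrative example rather than code from the PR: the use of `mtcars`, the filter on `cyl`, and the variable names are assumptions made for the sketch, and it presumes the `arrow`, `dplyr`, `dbplyr`, and `duckdb` packages are installed.

```r
library(arrow)
library(dplyr)

# to_arrow() returns a RecordBatchReader, which per the added note can only
# be passed to collect() or compute() once.
reader <- arrow_table(mtcars) %>%
  to_duckdb() %>%          # hand the data to DuckDB (needs duckdb and dbplyr)
  filter(cyl == 4) %>%     # illustrative filter, evaluated by DuckDB
  to_arrow()               # bring the result back as a RecordBatchReader

# Materialize the reader once into an in-memory Arrow Table.
tbl <- as_arrow_table(reader)

# The Table can then be queried as many times as needed.
four_cyl <- tbl %>% collect()
mean_mpg <- tbl %>% summarise(mean_mpg = mean(mpg)) %>% collect()
```

Materializing with `as_arrow_table()` trades memory for reusability; when the result is only needed once, calling `collect()` as the final step of the pipeline avoids holding the extra copy.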
### What changes are included in this PR?
Just docs.
### Are these changes tested?
Yes. Built and tested locally.
### Are there any user-facing changes?
Just docs.
* GitHub Issue: #44069
Authored-by: Bryce Mecum <[email protected]>
Signed-off-by: Nic Crane <[email protected]>
---
r/R/duckdb.R | 8 +++++++-
r/man/to_arrow.Rd | 9 ++++++++-
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/r/R/duckdb.R b/r/R/duckdb.R
index a2bf62de2f..65c70243e7 100644
--- a/r/R/duckdb.R
+++ b/r/R/duckdb.R
@@ -137,7 +137,13 @@ duckdb_disconnector <- function(con, tbl_name) {
#' Create an Arrow object from a DuckDB connection
#'
-#' This can be used in pipelines that pass data back and forth between Arrow and DuckDB
+#' This can be used in pipelines that pass data back and forth between Arrow and
+#' DuckDB.
+#'
+#' Note that you can only call `collect()` or `compute()` on the result of this
+#' function once. To work around this limitation, you should either only call
+#' `collect()` as the final step in a pipeline or call `as_arrow_table()` on the
+#' result to materialize the entire Table in-memory.
#'
#' @param .data the object to be converted
#' @return A `RecordBatchReader`.
diff --git a/r/man/to_arrow.Rd b/r/man/to_arrow.Rd
index aed40609a5..87b8fea36e 100644
--- a/r/man/to_arrow.Rd
+++ b/r/man/to_arrow.Rd
@@ -13,7 +13,14 @@ to_arrow(.data)
A \code{RecordBatchReader}.
}
\description{
-This can be used in pipelines that pass data back and forth between Arrow and DuckDB
+This can be used in pipelines that pass data back and forth between Arrow and
+DuckDB.
+}
+\details{
+Note that you can only call \code{collect()} or \code{compute()} on the result of this
+function once. To work around this limitation, you should either only call
+\code{collect()} as the final step in a pipeline or call \code{as_arrow_table()} on the
+result to materialize the entire Table in-memory.
}
\examples{
\dontshow{if (getFromNamespace("run_duckdb_examples", "arrow")()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}