jonkeane opened a new pull request #10780:
URL: https://github.com/apache/arrow/pull/10780
A proposed interface for using DuckDB + Arrow together.
I've added two methods, plus a proof-of-concept for Python:
* The proposed `summarise(..., .engine = "duckdb")` method, which is
(probably) the one that people will want to use
* A lower-level method of specifying exactly when the transfer takes
place. I've called this `alchemize_*` for now, though we might consider wedging
it into `collect()` or `compute()` (or something like `collect_to_duckdb()` to
be super explicit[1]).
* I've made a proof-of-concept showing that the `alchemize_*` functions can
also work with Python; this is basically a renaming/wrapping of `r_to_py` /
`py_to_r`. If we do pursue exposing `alchemize_*` or the like, I will fill out
the rest of these (we should keep both around, though `r_to_py` isn't currently
documented so probably isn't getting much use).
[1] I've also tried a more magical `alchemize(x, to = c("arrow", "duckdb",
"python"))` that changes behavior / output based on the `to` argument, which we
can go back to if we want that simplicity, but I found it harder to reason
about what I was getting out, whereas with `alchemize_to_duckdb()` the
function name says exactly what's going on.