jonkeane opened a new pull request #10780: URL: https://github.com/apache/arrow/pull/10780
A proposed interface for using DuckDB + Arrow together. I've added two methods:

* The proposed `summarise(..., .engine = "duckdb")` method, which is (probably) the method that people want to use.
* A lower-level method of specifying exactly when the transfer takes place. I've called this `alchemize_*` for now, though we might consider wedging it into `collect()` or `compute()` (or something like `collect_to_duckdb()` to be super explicit[1]).
* I've made a proof-of-concept showing that `alchemize_*` can also work with Python; this is basically a renaming/wrapping of `r_to_py` / `py_to_r`. If we do pursue exposing `alchemize_*` or the like, I will fill out the rest of these (we should keep both around, though `r_to_py` isn't currently documented so probably isn't getting much use).

[1] I've tried both approaches. The more magical `alchemize(x, to = c("arrow", "duckdb", "python"))` changes behavior / output based on the `to` argument, and we can go back to it if we want that simplicity, but I found it harder to reason about what I was getting out, whereas with `alchemize_to_duckdb()` the function name says exactly what's going on.
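To make the trade-off in footnote [1] concrete, here is a toy sketch in base R. It is not the code from this PR: the `alchemize*` bodies below are hypothetical stubs that only tag their input with a made-up `toy_*` class, and `tag_as()` is an invented helper. No real Arrow, DuckDB, or Python conversion happens. The sketch only contrasts a single function whose return type depends on `to` with explicitly named functions.

```r
# Toy stand-ins only: these stubs tag their input with a class instead of
# performing any real Arrow, DuckDB, or Python conversion.
tag_as <- function(x, target) {
  structure(list(data = x), class = paste0("toy_", target))
}

# Explicit variants: the function name alone tells you what comes back.
alchemize_to_arrow  <- function(x) tag_as(x, "arrow")
alchemize_to_duckdb <- function(x) tag_as(x, "duckdb")
alchemize_to_python <- function(x) tag_as(x, "python")

# Single-function variant: the return type depends on the value of `to`.
alchemize <- function(x, to = c("arrow", "duckdb", "python")) {
  to <- match.arg(to)
  switch(to,
    arrow  = alchemize_to_arrow(x),
    duckdb = alchemize_to_duckdb(x),
    python = alchemize_to_python(x)
  )
}

df <- data.frame(g = c("a", "a", "b"), v = c(1, 2, 3))

class(alchemize_to_duckdb(df))      # "toy_duckdb"
class(alchemize(df, to = "duckdb")) # "toy_duckdb"
class(alchemize(df))                # "toy_arrow": match.arg() picks the first choice
```

With the single-function form, a reader has to find the value of `to` (or remember the default) to know what kind of object comes back; with the explicit form, the call site carries that information by itself.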