[
https://issues.apache.org/jira/browse/ARROW-16701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544617#comment-17544617
]
Ian Cook commented on ARROW-16701:
----------------------------------
IMO the description above conflates {{to_*()}} functions (which control which
backend should execute the next verbs in the chain) with a hypothetical
{{sql()}} function (which I strongly believe should operate like a pure dplyr
verb, and could at least hypothetically work with any backend).
> [R] Can we execute SQL in a dplyr pipeline?
> -------------------------------------------
>
> Key: ARROW-16701
> URL: https://issues.apache.org/jira/browse/ARROW-16701
> Project: Apache Arrow
> Issue Type: New Feature
> Components: R
> Reporter: Jonathan Keane
> Priority: Major
>
> Now that we have {{to_duckdb()}} and {{to_arrow()}} is it possible to wrap
> those and allow someone to insert arbitrary SQL into a dplyr query?
> Something like:
> {code:r}
> sql <- function(data, sql) {
> tbl <- to_duckdb(data)
> res <- DBI::dbSendQuery(dbplyr::remote_con(.data), sql, arrow = TRUE)
> duckdb::duckdb_fetch_record_batch(res)
> }
> ds %>%
> filter(year > 2020) %>%
> sql("SELECT tip_amount, fare_amount, total_amount FROM ") %>%
> compute()
> {code}
> This won't work totally, but is vaguely what we're looking for.
> One part that we need to think about is how to deal with the {{from}} clause,
> a few possibilities:
> * ibis does this by making you "name" the table before doing sql so you can
> FROM explicitly
> * though maybe you could get away with FROM . like it is a magrittr thing and
> sub that
> * empty string, and we add it in based on the lazy_tbl object
> Possibly related prior art:
> https://dbplyr.tidyverse.org/reference/build_sql.html (though the name isn't
> perfect IMO, and I think this is more geared towards package developers than
> end users?)
--
This message was sent by Atlassian Jira
(v8.20.7#820007)