nbenn opened a new issue, #1397:
URL: https://github.com/apache/arrow-adbc/issues/1397
Currently there is no way to determine what types of placeholders a backend
supports for parametrized queries.
SQLite for example is quite flexible, supporting all of "?", "$i", "$name",
":name" (where "i" represents a parameter position and "name" a parameter
name), while Postgres (I believe) only supports "?" and "$i" and cannot handle
named parameters.
In what way this information could be communicated and whether it is
worthwhile to do so is currently unclear to me. I'm hoping that this could be a
place to have such a discussion.
Maybe I can start out explaining a bit why this became relevant tom me and
then we see where this goes. It might be that we can sort out my specific
issues without introducing such infrastructure at all. Of course this might be
interesting to others nonetheless.
For context: I'm writing a high level wrapper in R that aligns the DBI
interface with functionality exposed by adbcdrivermanager. As example, consider
the following query:
```r
library(adbcdrivermanager)
db <- adbc_database_init(adbcsqlite::adbcsqlite(), uri = ":memory:")
con <- adbc_connection_init(db)
write_adbc(datasets::swiss, con, "swiss", temporary = TRUE)
stmt <- adbc_statement_init(con)
adbc_statement_set_sql_query(
stmt,
"SELECT * from swiss WHERE Agriculture < $agr AND Education > $edu"
)
adbc_statement_prepare(stmt)
```
We have a "parameter schema" with parameter names
```r
names(adbc_statement_get_parameter_schema(stmt)$children)
#> [1] "$agr" "$edu"
```
At the same time, I want my users to be able to pass parameter values by
referring to more natural names such as "agr" and "edu". Something like
```r
execute_query <- function(...) {
params <- data.frame(...)
adbc_statement_bind_stream(stmt, params)
res <- nanoarrow::nanoarrow_allocate_array_stream()
adbc_statement_execute_query(stmt, res)
as.data.frame(res)
}
nrow(
execute_query(agr = 30, edu = 10)
)
#> [1] 9
```
As things stand, it seems that for named parameters, names are not
considered and only positions are relevant. This makes the current
implementation of parameter binding (at least for SQLite) quite lenient.
```r
nrow(execute_query(agr = 30, typo = 10))
#> [1] 9
nrow(execute_query(edu = 10, agr = 30))
#> [1] 1
nrow(execute_query(agr = 30, edu = "typo"))
#> [1] 0
```
If I want to properly support named parameters, I have to sort out ordering
on my end, as well as potentially raising errors for name/type mismatches. To
do this, I need to match names in the parameter schema with my "natural" names.
And this is where having information on what kinds of placeholder patterns are
supported by the back-end becomes relevant to me.
Cc @krlmlr @paleolimbot
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]