Copilot commented on code in PR #662:
URL: https://github.com/apache/sedona-db/pull/662#discussion_r2850401803
##########
r/sedonadb/man/sd_sql.Rd:
##########
@@ -5,13 +5,15 @@
\alias{sd_ctx_sql}
\title{Create a DataFrame from SQL}
\usage{
-sd_sql(sql)
+sd_sql(sql, ..., params = NULL)
-sd_ctx_sql(ctx, sql)
+sd_ctx_sql(ctx, sql, ..., params = NULL)
}
\arguments{
\item{sql}{A SQL string to execute}
+\item{params}{A list of parameters to fill placeholders in the query.}
+
\item{ctx}{A SedonaDB context.}
}
Review Comment:
The documentation is missing the ... parameter that appears in the function
signature. Although it's only used internally for check_dots_empty(), it should
be documented in the `@param` section to match the function signature, or the
function should use `@inheritDotParams` if appropriate.
##########
r/sedonadb/R/context.R:
##########
@@ -146,22 +137,32 @@ sd_ctx_read_parquet <- function(ctx, path) {
#'
#' @param ctx A SedonaDB context.
#' @param sql A SQL string to execute
+#' @param params A list of parameters to fill placeholders in the query.
Review Comment:
The documentation is missing the ... parameter that appears in the function
signature. Although it's only used internally for check_dots_empty(), it should
be documented in the `@param` section to match the function signature. Consider
adding something like: `@param` ... These dots are for future extensions and
currently must be empty.
```suggestion
#' @param params A list of parameters to fill placeholders in the query.
#' @param ... These dots are for future extensions and currently must be
empty.
```
##########
r/sedonadb/src/rust/src/expression.rs:
##########
@@ -206,6 +206,43 @@ impl SedonaDBExprFactory {
})
.collect()
}
+
+ pub fn param_values(exprs_sexp: savvy::Sexp) -> savvy::Result<ParamValues>
{
+ let literals = savvy::ListSexp::try_from(exprs_sexp)?
+ .iter()
+ .map(
+ |(name, item)| -> savvy::Result<(String, ScalarAndMetadata)> {
+ // item here is the Environment wrapper around the
external pointer
+ let expr_wrapper: &SedonaDBExpr =
+ EnvironmentSexp::try_from(item)?.try_into()?;
+ if let Expr::Literal(scalar, metadata) =
&expr_wrapper.inner {
+ Ok((
+ name.to_string(),
+ ScalarAndMetadata::new(scalar.clone(),
metadata.clone()),
+ ))
+ } else {
+ Err(savvy_err!(
+ "Expected literal expression but got {:?}",
+ expr_wrapper.inner
+ ))
+ }
+ },
+ )
+ .collect::<savvy::Result<Vec<_>>>()?;
+
+ let has_names = literals.iter().any(|(name, _)| !name.is_empty());
+ if literals.is_empty() || has_names {
+ if !literals.iter().all(|(name, _)| !name.is_empty()) {
+ return Err(savvy_err!("params must be all named or all
unnamed"));
+ }
+
+ Ok(ParamValues::Map(literals.into_iter().rev().collect()))
Review Comment:
The reversal of named parameters with .rev() appears questionable. When
converting to ParamValues::Map, the named parameters are reversed. This could
potentially cause issues if the underlying DataFusion ParamValues::Map
implementation expects parameters in a specific order, though for a
HashMap-like structure, order typically shouldn't matter. However, if this is
intentional (e.g., to compensate for R's list iteration order), it should be
documented with a comment explaining why the reversal is necessary.
```suggestion
Ok(ParamValues::Map(literals.into_iter().collect()))
```
##########
r/sedonadb/man/sd_with_params.Rd:
##########
@@ -0,0 +1,28 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/dataframe.R
+\name{sd_with_params}
+\alias{sd_with_params}
+\title{Fill in placeholders}
+\usage{
+sd_with_params(.data, ...)
+}
+\arguments{
+\item{.data}{A sedonadb_dataframe or an object that can be coerced to one.}
+
+\item{...}{Named or unnamed parameters that will be coerced to literals
+with \code{\link[=as_sedonadb_literal]{as_sedonadb_literal()}}.}
+}
+\value{
+The number of rows after executing the query
Review Comment:
The return value documentation is incorrect. This should document that the
function returns a sedonadb_dataframe, not "the number of rows after executing
the query". This matches the same issue in the R source file.
```suggestion
A sedonadb_dataframe.
```
##########
r/sedonadb/tests/testthat/test-dataframe.R:
##########
@@ -83,6 +83,27 @@ test_that("dataframe rows can be counted", {
expect_identical(sd_count(df), 1)
})
+test_that("sd_with_params() fills in placeholder values", {
+ df <- sd_sql("SELECT $1 + 1 AS two") |> sd_with_params(1)
+ expect_identical(sd_collect(df), data.frame(two = 2))
+
+ df <- sd_sql("SELECT $x + 1 AS two") |> sd_with_params(x = 1)
+ expect_identical(sd_collect(df), data.frame(two = 2))
+
+ # Check that an error occurs for missing parameters
+ expect_snapshot_error(
+ sd_sql("SELECT $x + 1 AS two") |> sd_with_params()
+ )
+ expect_snapshot_error(
+ sd_sql("SELECT $1 + 1 AS two") |> sd_with_params()
+ )
+
+ # Check error for mixed named/unnamed
+ expect_snapshot_error(
+ sd_sql("SELECT $x + 1 AS two") |> sd_with_params(x = 1, 2)
+ )
+})
Review Comment:
There's no test coverage for verifying that multiple named parameters are
passed in the correct order. Consider adding a test like: sd_sql("SELECT $a ||
$b AS result") |> sd_with_params(a = "first", b = "second") and verifying the
result is "firstsecond" rather than "secondfirst". This would help ensure the
parameter order is preserved correctly through the Rust implementation.
##########
r/sedonadb/R/dataframe.R:
##########
@@ -116,6 +116,31 @@ sd_count <- function(.data) {
.data$df$count()
}
+#' Fill in placeholders
+#'
+#' This is a slightly more verbose form of [sd_sql()] with `params` that is
+#' useful if a data frame is to be repeatedly queried.
+#'
+#' @inheritParams sd_count
+#' @param ... Named or unnamed parameters that will be coerced to literals
+#' with [as_sedonadb_literal()].
+#'
+#' @returns The number of rows after executing the query
Review Comment:
The `@returns` documentation is incorrect. This function returns a
sedonadb_dataframe, not "the number of rows after executing the query". The
correct return value should match the actual behavior of the function.
```suggestion
#' @returns A sedonadb_dataframe with the provided parameters filled into
the query
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]