devinjdangelo commented on code in PR #10767:
URL: https://github.com/apache/datafusion/pull/10767#discussion_r1623447090


##########
datafusion/sql/src/unparser/expr.rs:
##########
@@ -513,20 +513,30 @@ impl Unparser<'_> {
     fn convert_bound(
         &self,
         bound: &datafusion_expr::window_frame::WindowFrameBound,
-    ) -> ast::WindowFrameBound {
+    ) -> Result<ast::WindowFrameBound> {
         match bound {
             datafusion_expr::window_frame::WindowFrameBound::Preceding(val) => 
{
-                ast::WindowFrameBound::Preceding(
-                    self.scalar_to_sql(val).map(Box::new).ok(),
-                )
+                Ok(ast::WindowFrameBound::Preceding({
+                    let val = self.scalar_to_sql(val)?;

Review Comment:
   There is a subtle difference in how datafusion plans a window frame bound 
that is `None` vs ScalarValue::Null.
   
   The former yields `PRECEDING UNBOUNDED` and the latter yields `PRECEDING 
NULL`.
   
   Datafusion's planner accepts the former, but rejects the latter.



##########
datafusion/sql/tests/cases/plan_to_sql.rs:
##########
@@ -127,7 +127,10 @@ fn roundtrip_statement() -> Result<()> {
             UNION ALL
             SELECT j2_string as string FROM j2
             ORDER BY string DESC
-            LIMIT 10"#
+            LIMIT 10"#,
+            "SELECT id, count(*) over (PARTITION BY first_name ROWS BETWEEN 
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), 

Review Comment:
   The roundtrip test will fail if `ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING` is not explicitly included. E.g. the datafusion planner 
generates a non identical plan for the following two SQL queries:
   
   ```sql
   SELECT id, 
   count(*) over (PARTITION BY first_name ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING), 
   last_name, 
   sum(id) over (PARTITION BY first_name ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING),
   first_name from person
   ```
   vs
   
   ```sql
   SELECT id, 
   count(*) over (PARTITION BY first_name), 
   last_name, 
   sum(id) over (PARTITION BY first_name),
   first_name from person
   ```
   While the two plans are not identical `p1!=p2`, I believe the difference is 
trivial and will actually result in the same computations taking place.



##########
datafusion/sql/src/unparser/expr.rs:
##########
@@ -1148,7 +1158,7 @@ mod tests {
                     window_frame: WindowFrame::new(None),
                     null_treatment: None,
                 }),
-                r#"ROW_NUMBER(col) OVER (ROWS BETWEEN NULL PRECEDING AND NULL 
FOLLOWING)"#,
+                r#"ROW_NUMBER(col) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING)"#,

Review Comment:
   See comment on L520 for explanation of this test change.



##########
datafusion/sql/src/unparser/plan.rs:
##########
@@ -162,23 +162,40 @@ impl Unparser<'_> {
                 // A second projection implies a derived tablefactor
                 if !select.already_projected() {
                     // Special handling when projecting an agregation plan
-                    if let Some(agg) = find_agg_node_within_select(plan, true) 
{
-                        let items = p
-                            .expr
-                            .iter()
-                            .map(|proj_expr| {
-                                let unproj = unproject_agg_exprs(proj_expr, 
agg)?;
-                                self.select_item_to_sql(&unproj)
-                            })
-                            .collect::<Result<Vec<_>>>()?;
-
-                        select.projection(items);
-                        select.group_by(ast::GroupByExpr::Expressions(
-                            agg.group_expr
-                                .iter()
-                                .map(|expr| self.expr_to_sql(expr))
-                                .collect::<Result<Vec<_>>>()?,
-                        ));
+                    if let Some(aggvariant) = 
find_agg_node_within_select(plan, true) {

Review Comment:
   The `select_to_sql_recursively` method has grown deeply nested/complex and 
is due for a refactor or at least breaking up into more helper methods to 
improve readability.



##########
datafusion/sql/src/unparser/utils.rs:
##########
@@ -20,16 +20,23 @@ use datafusion_common::{
     tree_node::{Transformed, TreeNode},
     Result,
 };
-use datafusion_expr::{Aggregate, Expr, LogicalPlan};
+use datafusion_expr::{Aggregate, Expr, LogicalPlan, Window};
 
-/// Recursively searches children of [LogicalPlan] to find an Aggregate node 
if one exists
+/// One of the possible aggregation plans which can be found within a single 
select query.
+pub(crate) enum AggVariant<'a> {

Review Comment:
   This assumes a SELECT query can exclusively have only a window function or 
an aggregate function but not both. A LogicalPlan can certainly have both, but 
I could not find an example of a single SELECT query without any 
nesting/derived table factors that is allowed to have both.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to