devinjdangelo commented on code in PR #10767: URL: https://github.com/apache/datafusion/pull/10767#discussion_r1623447090
########## datafusion/sql/src/unparser/expr.rs: ########## @@ -513,20 +513,30 @@ impl Unparser<'_> { fn convert_bound( &self, bound: &datafusion_expr::window_frame::WindowFrameBound, - ) -> ast::WindowFrameBound { + ) -> Result<ast::WindowFrameBound> { match bound { datafusion_expr::window_frame::WindowFrameBound::Preceding(val) => { - ast::WindowFrameBound::Preceding( - self.scalar_to_sql(val).map(Box::new).ok(), - ) + Ok(ast::WindowFrameBound::Preceding({ + let val = self.scalar_to_sql(val)?; Review Comment: There is a subtle difference in how datafusion plans a window frame bound that is `None` vs ScalarValue::Null. The former yields `PRECEDING UNBOUNDED` and the latter yields `PRECEDING NULL`. Datafusion's planner accepts the former, but rejects the latter. ########## datafusion/sql/tests/cases/plan_to_sql.rs: ########## @@ -127,7 +127,10 @@ fn roundtrip_statement() -> Result<()> { UNION ALL SELECT j2_string as string FROM j2 ORDER BY string DESC - LIMIT 10"# + LIMIT 10"#, + "SELECT id, count(*) over (PARTITION BY first_name ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), Review Comment: The roundtrip test will fail if `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING` is not explicitly included. E.g. the datafusion planner generates a non identical plan for the following two SQL queries: ```sql SELECT id, count(*) over (PARTITION BY first_name ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), last_name, sum(id) over (PARTITION BY first_name ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), first_name from person ``` vs ```sql SELECT id, count(*) over (PARTITION BY first_name), last_name, sum(id) over (PARTITION BY first_name), first_name from person ``` While the two plans are not identical `p1!=p2`, I believe the difference is trivial and will actually result in the same computations taking place. ########## datafusion/sql/src/unparser/expr.rs: ########## @@ -1148,7 +1158,7 @@ mod tests { window_frame: WindowFrame::new(None), null_treatment: None, }), - r#"ROW_NUMBER(col) OVER (ROWS BETWEEN NULL PRECEDING AND NULL FOLLOWING)"#, + r#"ROW_NUMBER(col) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)"#, Review Comment: See comment on L520 for explanation of this test change. ########## datafusion/sql/src/unparser/plan.rs: ########## @@ -162,23 +162,40 @@ impl Unparser<'_> { // A second projection implies a derived tablefactor if !select.already_projected() { // Special handling when projecting an agregation plan - if let Some(agg) = find_agg_node_within_select(plan, true) { - let items = p - .expr - .iter() - .map(|proj_expr| { - let unproj = unproject_agg_exprs(proj_expr, agg)?; - self.select_item_to_sql(&unproj) - }) - .collect::<Result<Vec<_>>>()?; - - select.projection(items); - select.group_by(ast::GroupByExpr::Expressions( - agg.group_expr - .iter() - .map(|expr| self.expr_to_sql(expr)) - .collect::<Result<Vec<_>>>()?, - )); + if let Some(aggvariant) = find_agg_node_within_select(plan, true) { Review Comment: The `select_to_sql_recursively` method has grown deeply nested/complex and is due for a refactor or at least breaking up into more helper methods to improve readability. ########## datafusion/sql/src/unparser/utils.rs: ########## @@ -20,16 +20,23 @@ use datafusion_common::{ tree_node::{Transformed, TreeNode}, Result, }; -use datafusion_expr::{Aggregate, Expr, LogicalPlan}; +use datafusion_expr::{Aggregate, Expr, LogicalPlan, Window}; -/// Recursively searches children of [LogicalPlan] to find an Aggregate node if one exists +/// One of the possible aggregation plans which can be found within a single select query. +pub(crate) enum AggVariant<'a> { Review Comment: This assumes a SELECT query can exclusively have only a window function or an aggregate function but not both. A LogicalPlan can certainly have both, but I could not find an example of a single SELECT query without any nesting/derived table factors that is allowed to have both. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org