alamb commented on code in PR #13039:
URL: https://github.com/apache/datafusion/pull/13039#discussion_r1809481258


##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -2889,7 +2890,9 @@ impl DistinctOn {
         // Check that the left-most sort expressions are the same as the `ON` 
expressions.
         let mut matched = true;
         for (on, sort) in self.on_expr.iter().zip(sort_expr.iter()) {
-            if on != &sort.expr {
+            if normalize_col(on.clone(), &self.input)?

Review Comment:
   It seems like the `sort` is already normalized; 
https://github.com/apache/datafusion/blob/2e274bfbdfc19f11b2951456bb2a48e88733c9bf/datafusion/expr/src/expr_rewriter/mod.rs#L127-L128
   
   Likewise isn't `self.on_exprs` is also already normalized ?
   
   
https://github.com/apache/datafusion/blob/c3a919a9b87646c531f96e39e7217c11480b3118/datafusion/expr/src/logical_plan/plan.rs#L2858-L2859
   
   



##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -3870,6 +3873,39 @@ digraph {
         );
     }
 
+    #[test]

Review Comment:
   Could you also please add a sqllogictest? Perhaps by extending 
https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/distinct_on.slt
 ?



##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -3870,6 +3873,39 @@ digraph {
         );
     }
 
+    #[test]
+    fn distinct_on_expr_order_by_mismatch() -> Result<()> {
+        let schema = Schema::new(vec![
+            Field::new("a", DataType::Int32, false),
+            Field::new("b", DataType::Int32, false),
+        ]);
+
+        let table_scan = table_scan(TableReference::none(), &schema, 
None)?.build()?;
+        let p = DistinctOn::try_new(
+            vec![col("a")],
+            vec![],
+            Some(vec![SortExpr::new(col("b"), true, false)]),
+            Arc::new(table_scan),
+        );
+        assert_eq!(p.err().unwrap().strip_backtrace(), "Error during planning: 
SELECT DISTINCT ON expressions must match initial ORDER BY expressions");
+        Ok(())
+    }
+
+    #[test]
+    fn distinct_on_expr_order_by_match() -> Result<()> {

Review Comment:
   When I revert the code changes in this PR, these tests still pass 🤔 -- so I 
am not sure this PR solves the issue
   
   
   ```diff
   diff --git a/datafusion/expr/src/logical_plan/plan.rs 
b/datafusion/expr/src/logical_plan/plan.rs
   index 6f4e78ccf..9d7d893be 100644
   --- a/datafusion/expr/src/logical_plan/plan.rs
   +++ b/datafusion/expr/src/logical_plan/plan.rs
   @@ -28,8 +28,7 @@ use super::DdlStatement;
    use crate::builder::{change_redundant_column, unnest_with_options};
    use crate::expr::{Placeholder, Sort as SortExpr, WindowFunction};
    use crate::expr_rewriter::{
   -    create_col_from_scalar_expr, normalize_col, normalize_cols, 
normalize_sorts,
   -    NamePreserver,
   +    create_col_from_scalar_expr, normalize_cols, normalize_sorts, 
NamePreserver,
    };
    use crate::logical_plan::display::{GraphvizVisitor, IndentVisitor};
    use crate::logical_plan::extension::UserDefinedLogicalNode;
   @@ -2890,9 +2889,7 @@ impl DistinctOn {
            // Check that the left-most sort expressions are the same as the 
`ON` expressions.
            let mut matched = true;
            for (on, sort) in self.on_expr.iter().zip(sort_expr.iter()) {
   -            if normalize_col(on.clone(), &self.input)?
   -                != normalize_col(sort.expr.clone(), &self.input)?
   -            {
   +            if on != &sort.expr {
                    matched = false;
                    break;
                }
   ```
   
   
   ```shell
   (venv) andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ cargo test -p 
datafusion-expr -- distinct_on
      Compiling datafusion-expr v42.1.0 
(/Users/andrewlamb/Software/datafusion/datafusion/expr)
   warning: unused variable: `p`
       --> datafusion/expr/src/logical_plan/plan.rs:3897:13
        |
   3897 |         let p = DistinctOn::try_new(
        |             ^ help: if this is intentional, prefix it with an 
underscore: `_p`
        |
        = note: `#[warn(unused_variables)]` on by default
   
   warning: `datafusion-expr` (lib test) generated 1 warning
       Finished `test` profile [unoptimized + debuginfo] target(s) in 2.99s
        Running unittests src/lib.rs 
(target/debug/deps/datafusion_expr-3be1ac638c9d6fb5)
   
   running 2 tests
   test logical_plan::plan::tests::distinct_on_expr_order_by_match ... ok
   test logical_plan::plan::tests::distinct_on_expr_order_by_mismatch ... ok
   
   test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 107 filtered 
out; finished in 0.00s
   
      Doc-tests datafusion_expr
   
   running 0 tests
   
   test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 45 filtered out; 
finished in 0.01s
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to