berkaysynnada commented on code in PR #6566:
URL: https://github.com/apache/arrow-datafusion/pull/6566#discussion_r1236971325


##########
datafusion/sql/src/select.rs:
##########
@@ -194,7 +198,13 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
                 .iter()
                 .map(|expr| rebase_expr(expr, &window_func_exprs, &plan))
                 .collect::<Result<Vec<Expr>>>()?;
-
+            if select.into.is_some() {
+                for expr in select_exprs_post_aggr.iter_mut() {
+                    if let Expr::Column(_) = expr.clone() {
+                        *expr = expr.clone().alias(physical_name(expr)?);
+                    }
+                }
+            }

Review Comment:
   Sorry for the late reply. I have tried both suggestions @alamb:
   1) We need to modify 3 parts of the code: 
   a- In the `project()` function in select.rs, the final schema will be 
constructed with a shortened form of the window function. 
   ```
       fn to_field(&self, input_schema: &DFSchema) -> Result<DFField> {
           match self {
               Expr::Column(c) => Ok(DFField::new(
                   c.relation.clone(),
                   &c.name,
                   self.get_type(input_schema)?,
                   self.nullable(input_schema)?,
               )),
               _ => {
                   Ok(DFField::new_unqualified(
                       &self.display_name()?,
                       self.get_type(input_schema)?,
                       self.nullable(input_schema)?,
                   ))
               }
           }
       }
   ```
   is expanded with that arm:
                   
   ```
             Expr::WindowFunction(WindowFunction { fun, args, .. }) => {
                   Ok(DFField::new_unqualified(
                       &vec![create_function_name(&fun.to_string(), false, 
args)?].join(" "),
                       self.get_type(input_schema)?,
                       self.nullable(input_schema)?,
                   ))
               }
   ```
   b- In the `project()` function again, qualified wildcard columns are 
normalized. However, the column name is in the longer form, and the schema of 
the plan is in the shorter form. Therefore, we also change the 
`expr_as_column_expr()` function so that window function expressions are 
converted to column expressions with the shortened column name format, which 
can be copied from the schema.
   c- `PushDownProjection` rule again creates new column expressions with 
`display_name() `function (which returns the long format) in the window 
handling arm. These column names also need to be shortened to satisfy 
subsequent control.
   
   **My opinion:** Can we directly change the `display_name()` function for 
window functions such that only function name and arguments are returned? Thus 
we don't need to change any of what I mentioned above.
   ```
           Expr::WindowFunction(WindowFunction {
               fun, args, window_frame, partition_by, order_by,
           }) => {
               let mut parts: Vec<String> =
                   vec![create_function_name(&fun.to_string(), false, args)?];
               if !partition_by.is_empty() {
                   parts.push(format!("PARTITION BY {partition_by:?}"));
               }
               if !order_by.is_empty() {
                   parts.push(format!("ORDER BY {order_by:?}"));
               }
               parts.push(format!("{window_frame}"));
               Ok(parts.join(" "))
           }
   ```
   new version:
   ```
   Expr::WindowFunction(WindowFunction {
               fun, args, ..
           }) => {
               let mut parts: Vec<String> =
                   vec![create_function_name(&fun.to_string(), false, args)?];
               Ok(parts.join(" "))
           }
   ```
   
   
   2) We can only change `create_window_expr()` such that it creates the window 
name with `display_name()` rather than `physical_name()`, but it makes the 
plans longer, also lots of test change burden.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to