alamb commented on code in PR #11797:
URL: https://github.com/apache/datafusion/pull/11797#discussion_r1706039120


##########
datafusion/expr/src/udf.rs:
##########
@@ -345,12 +351,22 @@ pub trait ScalarUDFImpl: Debug + Send + Sync {
     fn name(&self) -> &str;
 
     /// Returns the user-defined display name of the UDF given the arguments
-    ///
     fn display_name(&self, args: &[Expr]) -> Result<String> {
-        let names: Vec<String> = 
args.iter().map(create_name).collect::<Result<_>>()?;
+        let names: Vec<String> = 
args.iter().map(ToString::to_string).collect();
+        // TODO: join with ", " to standardize the formatting of Vec<Expr>, 
<https://github.com/apache/datafusion/issues/10364>
         Ok(format!("{}({})", self.name(), names.join(",")))
     }
 
+    /// Returns the user-defined schema name of the UDF given the arguments

Review Comment:
   ```suggestion
       /// Returns the name of the column this expression would create
       ///
       /// See [`Expr::schema_name`] for details
   ```
   
   Maybe it is also worth doing the `impl Display` thing here too which would 
allow avoiding these string allocations



##########
datafusion/expr/src/expr.rs:
##########
@@ -983,10 +983,290 @@ impl PartialOrd for Expr {
 }
 
 impl Expr {
-    /// Returns the name of this expression as it should appear in a schema. 
This name
-    /// will not include any CAST expressions.
-    pub fn display_name(&self) -> Result<String> {
-        create_name(self)
+    /// Returns the name for schema / field that is different from Display
+    /// Most of the expressions are the same as Display
+    /// Those are the main difference
+    /// 1. Alias, where excludes expression
+    /// 2. Cast / TryCast, where takes expression only

Review Comment:
   Here is a suggesiton for how to improve this documentation:
   
   ```suggestion
       /// The name of the column (field) that this `Expr` will produce.
       ///
       /// For example, for a projection (e.g. `SELECT <expr>`) the resulting 
arrow
       /// [`Schema`] will have a field with this name.
       ///
       /// Note that the resulting string is subtlety different than the 
`Display`
       /// representation for certain `Expr`. Some differences:
       ///
       /// 1. [`Expr::Alias`], which shows only the alias itself
       /// 2. [`Expr::Cast`] / [`Expr::TryCast`], which only displays the 
expression
       ///
       /// # Example
       /// ```
       /// # use datafusion_expr::{col, lit};
       /// let expr = col("foo").eq(lit(42));
       /// assert_eq!("foo = Int32(42)", expr.schema_name().unwrap());
       ///
       /// let expr = col("foo").alias("bar").eq(lit(11));
       /// assert_eq!("bar = Int32(11)", expr.schema_name().unwrap());
       /// ```
       ///
       /// [`Schema`]: arrow::datatypes::Schema
   ```



##########
datafusion/expr/src/expr.rs:
##########
@@ -983,10 +983,290 @@ impl PartialOrd for Expr {
 }
 
 impl Expr {
-    /// Returns the name of this expression as it should appear in a schema. 
This name
-    /// will not include any CAST expressions.
-    pub fn display_name(&self) -> Result<String> {
-        create_name(self)
+    /// Returns the name for schema / field that is different from Display
+    /// Most of the expressions are the same as Display
+    /// Those are the main difference
+    /// 1. Alias, where excludes expression
+    /// 2. Cast / TryCast, where takes expression only
+    pub fn schema_name(&self) -> Result<String> {

Review Comment:
   If we are going to introduce a new API anyways, what do you think about 
making an API that doesn't *require* allocating a new string and making it 
infallable. Something like
   
   ```suggestion
       pub fn schema_name<'a>(&'a self) -> impl Display + 'a {
   ```
   
   We could probably then implement a new type thing and mostly use the same 
code:
   
   ```rust
   // new type wrapper
   struct SchemaDisplay<'a>(&'a Expr);
   
   impl <'a> Display for SchemaDisplay<'a> {
     fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
           match self {
               // The same as Display
               Expr::Column(_)
               | Expr::Literal(_)
               | Expr::ScalarVariable(..)
               | Expr::Sort(_)
               | Expr::OuterReferenceColumn(..)
               | Expr::Placeholder(_)
               | Expr::Wildcard { .. } => write!(f, "{self}")?,
   ...
     }
   }
   
   return SchemaDisplay(self)
   ```
   
   
   The benefit of doing this would be anywhere it needed to get formatted would 
not require a copy
   
   ```rust
   println!("This would not allocated a String: {}", expr.schema_name());
   ```
   
   ```rust
   // get schema name as a string:
   let schema_name = expr.schema_name().to_string();
   



##########
datafusion/substrait/tests/cases/consumer_integration.rs:
##########
@@ -358,8 +358,8 @@ mod tests {
 
         let plan = from_substrait_plan(&ctx, &proto).await?;
         let plan_str = format!("{:?}", plan);
-        assert_eq!(plan_str, "Projection: Decimal128(Some(10000),5,2) * 
sum(CASE WHEN FILENAME_PLACEHOLDER_1.p_type LIKE CAST(Utf8(\"PROMO%\") AS Utf8) 
THEN FILENAME_PLACEHOLDER_0.l_extendedprice * Int32(1) - 
FILENAME_PLACEHOLDER_0.l_discount ELSE Decimal128(Some(0),19,0) END) / 
sum(FILENAME_PLACEHOLDER_0.l_extendedprice * Int32(1) - 
FILENAME_PLACEHOLDER_0.l_discount) AS PROMO_REVENUE\
-        \n  Aggregate: groupBy=[[]], aggr=[[sum(CASE WHEN 
FILENAME_PLACEHOLDER_1.p_type LIKE CAST(Utf8(\"PROMO%\") AS Utf8) THEN 
FILENAME_PLACEHOLDER_0.l_extendedprice * Int32(1) - 
FILENAME_PLACEHOLDER_0.l_discount ELSE Decimal128(Some(0),19,0) END), 
sum(FILENAME_PLACEHOLDER_0.l_extendedprice * Int32(1) - 
FILENAME_PLACEHOLDER_0.l_discount)]]\
+        assert_eq!(plan_str, "Projection: Decimal128(Some(10000),5,2) * 
sum(CASE WHEN FILENAME_PLACEHOLDER_1.p_type LIKE Utf8(\"PROMO%\") THEN 
FILENAME_PLACEHOLDER_0.l_extendedprice * Int32(1) - 
FILENAME_PLACEHOLDER_0.l_discount ELSE Decimal128(Some(0),19,0) END) / 
sum(FILENAME_PLACEHOLDER_0.l_extendedprice * Int32(1) - 
FILENAME_PLACEHOLDER_0.l_discount) AS PROMO_REVENUE\

Review Comment:
   I think it makes sense as long as the cast is still in plans (which it is)



##########
datafusion/expr/src/expr.rs:
##########
@@ -983,10 +983,290 @@ impl PartialOrd for Expr {
 }
 
 impl Expr {
-    /// Returns the name of this expression as it should appear in a schema. 
This name

Review Comment:
   Since this is a pub function what do you think about leaving it in marked as 
deprecated to help migration?
   
   I think this make the PR much easier to handle
   
   Something like
   
   ```rust
       #[deprecated(since = "31.0.0", note = "use data_type instead")]
       pub fn display_name(&self) -> Result<String> {
         self.schema_name()
       }
   ```
   



##########
datafusion/sqllogictest/test_files/array.slt:
##########
@@ -1137,7 +1137,7 @@ from arrays_values_without_nulls;
 ## array_element (aliases: array_extract, list_extract, list_element)
 
 # Testing with empty arguments should result in an error
-query error DataFusion error: Error during planning: Error during planning: 
array_element does not support zero arguments.
+query error DataFusion error: Execution error: expect 2 args, got 0

Review Comment:
   the old error message was nicer -- maybe we can update the code that 
generates this message to be nicer too 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to