adriangb commented on code in PR #15779:
URL: https://github.com/apache/datafusion/pull/15779#discussion_r2057512794


##########
datafusion/physical-expr-common/src/physical_expr.rs:
##########
@@ -333,6 +333,15 @@ pub trait PhysicalExpr: Send + Sync + Display + Debug + 
DynEq + DynHash {
         // This is a safe default behavior.
         Ok(None)
     }
+
+    /// Adapt this [`PhysicalExpr`] to a new schema.
+    /// For example, `Column("b", 1)` can be adapted to `Column("b", 0)`
+    /// given the schema `Schema::new(vec![("b", DataType::Int32)])`.
+    fn with_schema(&self, _schema: &Schema) -> Result<Option<Arc<dyn 
PhysicalExpr>>> {

Review Comment:
   > Since during the physical planning duplicate names are possible, a column 
shouldn't be used in a place apart from its reference schema.
   
   Ah good point. It's unclear to me where duplicate columns are allowed and 
where they aren't. E.g. `DFSchema` does not allow duplicate columns IIRC.
   
   > There should be an API at the ExecutionPlan level, like 
update_expression(&self, expr: Arc, down: bool) -> Option<Arc>
   
   There's some missing types. Do you mean:
   
   ```rust
   trait ExecutionPlan {
     fn update_expression(&self, expr: Arc<dyn PhysicalExpr>, down: bool) -> 
Option<Arc<dyn PhysicalExpr>>>;
   }
   ```
   
   ?
   
   I don't fully follow. What is this function expected to do for different 
operators, and how does it behave differently across operators? It would help 
me if you shared the larger vision of how this gets used for this PRs use case, 
in other places, etc. The code proposed in this PR is being used in places 
where there is no `ExecutionPlan` (e.g. in parquet pruning within 
`ParquetOpener`).
   
   Tangential but I generally don't like the idea of having a `bool` parameter 
like that and would rather it be two functions 
`update_expression_for_input_schema` and `update_expression_for_output_schema` 
or something like that.
   
   Back on topic: how is this different than say 
`PhysicalExpression::with_schema(expr, ExecutionPlan::schema(plan))`?
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to