adriangb commented on code in PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#discussion_r2020157010


##########
datafusion/proto/src/physical_plan/to_proto.rs:
##########
@@ -210,6 +212,7 @@ pub fn serialize_physical_expr(
     value: &Arc<dyn PhysicalExpr>,
     codec: &dyn PhysicalExtensionCodec,
 ) -> Result<protobuf::PhysicalExprNode> {
+    let value = snasphot_physical_expr(value.clone())?;

Review Comment:
   This handles serialization.
   
   The story for serialization (cc @XiangpengHao because this is relevant for 
LiquidCache) is that the naive thing will work out of the box: if a custom 
`DataSource` calls `serialize_physical_expr` in it's `open` method it will get 
a static version of the dynamic filter that won't update anymore.
   This already gets it stats pruning and _static_ predicate pushdown pruning.
   Note that this does have to happen in the `open` method: if it happens 
before that, at planning time, then dynamic filter pushdown won't work at all. 
Nothing will break, but there will be no perf benefit.
   If `serialize_physical_expr` is never called again things won't break: the 
serialization functionality will just get a PhysicalExpr it probably doesn't 
recognize, but it kind of has to handle these cases already.
   If it wants _dynamic_ predicate pushdown pruning the client side would need 
to call `serialize_physical_expr` after each batch is processed and broadcast / 
send an updated expression to the server if it changes.'
   
   I think this is a good outcome: it's easy to get 90% of the performance gain 
and the last 10% had to be customized to each case anyway but at least there's 
a good story for implementing that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to