adriangb commented on code in PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#discussion_r2020153605


##########
datafusion/physical-expr-common/src/physical_expr.rs:
##########
@@ -283,6 +284,47 @@ pub trait PhysicalExpr: Send + Sync + Display + Debug + 
DynEq + DynHash {
     /// See the [`fmt_sql`] function for an example of printing 
`PhysicalExpr`s as SQL.
     ///
     fn fmt_sql(&self, f: &mut Formatter<'_>) -> fmt::Result;
+
+    /// Take a snapshot of this `PhysicalExpr` if it is dynamic.
+    /// This is used to capture the current state of `PhysicalExpr`s that may 
contain
+    /// dynamic references to other operators in order to serialize it over 
the wire
+    /// or treat it via downcast matching.
+    ///
+    /// You should not call this method directly as it does not handle 
recursion.
+    /// Instead use `shapshot_physical_expr` to handle recursion and capture 
the
+    /// full state of the `PhysicalExpr`.
+    ///
+    /// This is expected to return "simple" expressions that do not have 
mutable state
+    /// and are composed of DataFusion's built-in `PhysicalExpr` 
implementations.
+    /// Callers however should *not* assume anything about the returned 
expressions
+    /// since callers and implementers may not agree on what "simple" or 
"built-in"
+    /// means.
+    /// In other words, if you need to searlize a `PhysicalExpr` across the 
wire
+    /// you should call this method and then try to serialize the result,
+    /// but you should handle unknown or unexpected `PhysicalExpr` 
implementations gracefully
+    /// just as if you had not called this method at all.
+    ///
+    /// In particular, consider:
+    /// * A `PhysicalExpr` that references the current state of a 
`datafusion::physical_plan::TopK`
+    ///   that is involved in a query with `SELECT * FROM t1 ORDER BY a LIMIT 
10`.
+    ///   This function may return something like `a >= 12`.
+    /// * A `PhysicalExpr` that references the current state of a 
`datafusion::physical_plan::joins::HashJoinExec`
+    ///   from a query such as `SELECT * FROM t1 JOIN t2 ON t1.a = t2.b`.
+    ///   This function may return something like `t2.b IN (1, 5, 7)`.
+    ///
+    /// A system or function that can only deal with a hardcoded set of 
`PhysicalExpr` implementations
+    /// or needs to serialize this state to bytes may not be able to handle 
these dynamic references.
+    /// In such cases, we should return a simplified version of the 
`PhysicalExpr` that does not
+    /// contain these dynamic references.
+    ///
+    /// Note for implementers: this method should *not* handle recursion.
+    /// Recursion is handled in `shapshot_physical_expr`.
+    fn snapshot(&self) -> Result<Option<Arc<dyn PhysicalExpr>>> {

Review Comment:
   @alamb I was able to clean this up a lot (smaller diff, removed the 
`DynamicFilterSource` trait you didn't like) by adding this method to 
`PhysicalExpr`. If that's acceptable I think that will be the best solution.



##########
datafusion/physical-plan/src/dynamic_filters.rs:
##########
@@ -0,0 +1,226 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::{
+    any::Any,
+    hash::Hash,
+    sync::{Arc, RwLock},
+};
+
+use datafusion_common::{
+    tree_node::{Transformed, TransformedResult, TreeNode},
+    Result,
+};
+use datafusion_expr::ColumnarValue;
+use datafusion_physical_expr::{expressions::lit, utils::conjunction, 
PhysicalExpr};
+
+/// A source of dynamic runtime filters.
+///
+/// During query execution, operators implementing this trait can provide
+/// filter expressions that other operators can use to dynamically prune data.
+///
+/// See `TopKDynamicFilterSource` in datafusion/physical-plan/src/topk/mod.rs 
for examples.
+pub trait DynamicFilterSource: Send + Sync + std::fmt::Debug + 'static {

Review Comment:
   Well I didn't remove it, but it's not a private internal implementation 
detail, as is `DynamicFilterPhysicalExpr` 😄 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to