alamb commented on code in PR #6592:
URL: https://github.com/apache/arrow-datafusion/pull/6592#discussion_r1222156771


##########
datafusion/physical-expr/src/window/mod.rs:
##########
@@ -32,6 +32,7 @@ mod window_frame_state;
 pub use aggregate::PlainAggregateWindowExpr;
 pub use built_in::BuiltInWindowExpr;
 pub use built_in_window_function_expr::BuiltInWindowFunctionExpr;
+pub use partition_evaluator::PartitionEvaluator;

Review Comment:
   `PartitionEvaluator` now appears in public docstrings (and I propose to make 
it part of the user defined window function API -- see 
https://github.com/apache/arrow-datafusion/issues/5781#issuecomment-1581428228)



##########
datafusion/physical-expr/src/window/built_in_window_function_expr.rs:
##########
@@ -69,6 +73,8 @@ pub trait BuiltInWindowFunctionExpr: Send + Sync + 
std::fmt::Debug {
         false
     }
 
+    /// If returns true, [`Self::create_evaluator`] must implement

Review Comment:
   I was a little unclear on why `use_window_frame` was part of 
`BuiltInWindowFunctionExpr` and not `PartitionEvaluator` but I haven't looked 
into it in more detail



##########
datafusion/physical-expr/src/window/sliding_aggregate.rs:
##########
@@ -36,12 +36,10 @@ use crate::{
     expressions::PhysicalSortExpr, reverse_order_bys, AggregateExpr, 
PhysicalExpr,
 };
 
-/// A window expr that takes the form of an aggregate function
-/// Aggregate Window Expressions that have the form
-/// `OVER({ROWS | RANGE| GROUPS} BETWEEN UNBOUNDED PRECEDING AND ...)`
-/// e.g cumulative window frames uses `PlainAggregateWindowExpr`. Where as 
Aggregate Window Expressions
-/// that have the form `OVER({ROWS | RANGE| GROUPS} BETWEEN M {PRECEDING| 
FOLLOWING} AND ...)`
-/// e.g sliding window frames uses `SlidingAggregateWindowExpr`.
+/// A window expr that takes the form of an aggregate function that
+/// can be incrementally computed over sliding windows.
+///
+/// See comments on [`WindowExpr`] for more details.

Review Comment:
   Consolidated this description into `WindowExpr`



##########
datafusion/physical-expr/src/window/partition_evaluator.rs:
##########
@@ -25,24 +25,70 @@ use datafusion_common::{DataFusionError, ScalarValue};
 use std::fmt::Debug;
 use std::ops::Range;
 
-/// Partition evaluator
+/// Partition evaluator for Window Functions
+///
+/// An implementation of this trait is created and used for each
+/// partition defined by the OVER clause.
+///
+/// For example, evaluating `window_func(val) OVER (PARTITION BY col)`
+/// on the following data:
+///
+/// ```text
+/// col | val
+/// --- + ----
+///  A  | 1
+///  A  | 1
+///  C  | 2
+///  D  | 3
+///  D  | 3
+/// ```
+///
+/// Will instantiate three `PartitionEvaluator`s, one each for the
+/// partitions defined by `col=A`, `col=B`, and `col=C`.
+///
+/// There are two types of `PartitionEvaluator`:
+///
+/// # Stateless `PartitionEvaluator`
+///

Review Comment:
   @mustafasrepo  / @ozankabak  if you have time to help me describe more 
clearly what Stateful and Stateless `PartitionEvaluator`s are , and 
specifically what is different between them, I would be most appreciative



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to