Jimexist commented on a change in pull request #631:
URL: https://github.com/apache/arrow-datafusion/pull/631#discussion_r659478843
##########
File path: datafusion/src/physical_plan/window_functions.rs
##########
@@ -208,11 +210,57 @@ pub(super) fn signature_for_built_in(fun: &BuiltInWindowFunction) -> Signature {
}
}
+/// Partition evaluator
+pub(crate) trait PartitionEvaluator {
+ /// Whether the evaluator should be evaluated with rank
+ fn include_rank(&self) -> bool {
+ false
+ }
+
+ /// evaluate the partition evaluator against the partitions
+ fn evaluate(&self, partition_points: Vec<Range<usize>>) -> Result<Vec<ArrayRef>> {
+ partition_points
+ .into_iter()
+ .map(|partition| self.evaluate_partition(partition))
+ .collect()
+ }
+
+ /// evaluate the partition evaluator against the partitions with rank information
+ fn evaluate_with_rank(
+ &self,
+ partition_points: Vec<Range<usize>>,
+ sort_partition_points: Vec<Range<usize>>,
+ ) -> Result<Vec<ArrayRef>> {
+ partition_points
+ .into_iter()
+ .map(|partition| {
+ let ranks_in_partition =
+ find_ranges_in_range(&partition, &sort_partition_points);
+ self.evaluate_partition_with_rank(partition, ranks_in_partition)
+ })
+ .collect()
+ }
+
+ /// evaluate the partition evaluator against the partition
+ fn evaluate_partition(&self, _partition: Range<usize>) -> Result<ArrayRef>;
Review comment:
Having thought about this for a while, I think we should merge this as is.
When arrow 4.4 is released, the partition points will be migrated to an
iterator. At that point I can unify both functions and let laziness do its
work (i.e. pass in the iterator in all cases and let the consumer decide).
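
To illustrate the unification mentioned above, here is a minimal, dependency-free sketch of an `evaluate` that accepts any iterator of partition ranges instead of a `Vec`. Everything in it is an assumption for illustration: `LazyPartitionEvaluator`, `RowNumber`, and `partition_ranges` are hypothetical names, and `Vec<u64>` stands in for `ArrayRef`. It is not the arrow or DataFusion API.

```rust
use std::ops::Range;

/// Hypothetical stand-in for `PartitionEvaluator`. It returns plain
/// `Vec<u64>` values instead of Arrow's `ArrayRef` so the sketch compiles
/// without any external crates.
trait LazyPartitionEvaluator {
    /// Evaluate one partition, given as a range of row indices.
    fn evaluate_partition(&self, partition: Range<usize>) -> Result<Vec<u64>, String>;

    /// Accept anything that yields ranges (a `Vec`, a lazy iterator, ...).
    /// Ranges are pulled one at a time, so a lazy source is never
    /// materialized into a `Vec` first.
    fn evaluate<I>(&self, partition_points: I) -> Result<Vec<Vec<u64>>, String>
    where
        I: IntoIterator<Item = Range<usize>>,
    {
        partition_points
            .into_iter()
            .map(|partition| self.evaluate_partition(partition))
            .collect()
    }
}

/// Toy evaluator: numbers the rows of each partition starting from 1.
struct RowNumber;

impl LazyPartitionEvaluator for RowNumber {
    fn evaluate_partition(&self, partition: Range<usize>) -> Result<Vec<u64>, String> {
        Ok((1..=partition.len() as u64).collect())
    }
}

/// Hypothetical helper: lazily yields one range per run of equal adjacent
/// keys, assuming `keys` is already sorted by the partition column.
fn partition_ranges<'a>(keys: &'a [i32]) -> impl Iterator<Item = Range<usize>> + 'a {
    let mut start = 0;
    std::iter::from_fn(move || {
        if start >= keys.len() {
            return None;
        }
        // Find where the current run of equal keys ends.
        let end = keys[start..]
            .iter()
            .position(|k| *k != keys[start])
            .map_or(keys.len(), |offset| start + offset);
        let range = start..end;
        start = end;
        Some(range)
    })
}
```

With this shape, callers can pass either `partition_ranges(&keys)` or an already collected `Vec<Range<usize>>` to the same `evaluate`, which is the "let the consumer decide" part of the comment.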