alamb commented on code in PR #10061:
URL: https://github.com/apache/arrow-datafusion/pull/10061#discussion_r1564030101


##########
datafusion/core/src/physical_optimizer/convert_first_last.rs:
##########
@@ -0,0 +1,294 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion_common::Result;
+use datafusion_common::{
+    config::ConfigOptions,
+    tree_node::{Transformed, TransformedResult, TreeNode},
+};
+use datafusion_physical_expr::expressions::{FirstValue, LastValue};
+use datafusion_physical_expr::{
+    equivalence::ProjectionMapping, reverse_order_bys, AggregateExpr,
+    EquivalenceProperties, PhysicalSortRequirement,
+};
+use datafusion_physical_plan::aggregates::concat_slices;
+use datafusion_physical_plan::{
+    aggregates::{AggregateExec, AggregateMode},
+    ExecutionPlan, ExecutionPlanProperties, InputOrderMode,
+};
+use std::sync::Arc;
+
+use datafusion_physical_plan::windows::get_ordered_partition_by_indices;
+
+use super::PhysicalOptimizerRule;
+
+/// The optimizer rule check the ordering requirements of the aggregate expressions.
+/// And convert between FIRST_VALUE and LAST_VALUE if possible.
+/// For example, If we have an ascending values and we want LastValue from the descending requirement,
+/// it is equivalent to FirstValue with the current ascending ordering.
+///
+/// The concrete example is that, says we have values c1 with [1, 2, 3], which is an ascending order.
+/// If we want LastValue(c1 order by desc), which is the first value of reversed c1 [3, 2, 1],
+/// so we can convert the aggregate expression to FirstValue(c1 order by asc),
+/// since the current ordering is already satisfied, it saves our time!
+#[derive(Default)]
+pub struct ConvertFirstLast {}
+
+impl ConvertFirstLast {
+    pub fn new() -> Self {
+        Self::default()
+    }
+}
+
+impl PhysicalOptimizerRule for ConvertFirstLast {
+    fn optimize(
+        &self,
+        plan: Arc<dyn ExecutionPlan>,
+        _config: &ConfigOptions,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        plan.transform_up(&get_common_requirement_of_aggregate_input)
+            .data()
+    }
+
+    fn name(&self) -> &str {
+        "SimpleOrdering"

Review Comment:
   I think this name should match the name of the structure -- that is `"ConvertFirstLast"` in this case



##########
datafusion/core/src/physical_optimizer/convert_first_last.rs:
##########
@@ -0,0 +1,294 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion_common::Result;
+use datafusion_common::{
+    config::ConfigOptions,
+    tree_node::{Transformed, TransformedResult, TreeNode},
+};
+use datafusion_physical_expr::expressions::{FirstValue, LastValue};
+use datafusion_physical_expr::{
+    equivalence::ProjectionMapping, reverse_order_bys, AggregateExpr,
+    EquivalenceProperties, PhysicalSortRequirement,
+};
+use datafusion_physical_plan::aggregates::concat_slices;
+use datafusion_physical_plan::{
+    aggregates::{AggregateExec, AggregateMode},
+    ExecutionPlan, ExecutionPlanProperties, InputOrderMode,
+};
+use std::sync::Arc;
+
+use datafusion_physical_plan::windows::get_ordered_partition_by_indices;
+
+use super::PhysicalOptimizerRule;
+
+/// The optimizer rule check the ordering requirements of the aggregate expressions.
+/// And convert between FIRST_VALUE and LAST_VALUE if possible.
+/// For example, If we have an ascending values and we want LastValue from the descending requirement,
+/// it is equivalent to FirstValue with the current ascending ordering.
+///
+/// The concrete example is that, says we have values c1 with [1, 2, 3], which is an ascending order.
+/// If we want LastValue(c1 order by desc), which is the first value of reversed c1 [3, 2, 1],
+/// so we can convert the aggregate expression to FirstValue(c1 order by asc),
+/// since the current ordering is already satisfied, it saves our time!
+#[derive(Default)]
+pub struct ConvertFirstLast {}

Review Comment:
   I wonder if we could call this something more general, like `OptimizeAggregateOrder` so it could potentially be used for aggregates other than `FIRST_VALUE` and `LAST_VALUE` 🤔



##########
datafusion/core/src/physical_optimizer/convert_first_last.rs:
##########
@@ -0,0 +1,294 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion_common::Result;
+use datafusion_common::{
+    config::ConfigOptions,
+    tree_node::{Transformed, TransformedResult, TreeNode},
+};
+use datafusion_physical_expr::expressions::{FirstValue, LastValue};
+use datafusion_physical_expr::{
+    equivalence::ProjectionMapping, reverse_order_bys, AggregateExpr,
+    EquivalenceProperties, PhysicalSortRequirement,
+};
+use datafusion_physical_plan::aggregates::concat_slices;
+use datafusion_physical_plan::{
+    aggregates::{AggregateExec, AggregateMode},
+    ExecutionPlan, ExecutionPlanProperties, InputOrderMode,
+};
+use std::sync::Arc;
+
+use datafusion_physical_plan::windows::get_ordered_partition_by_indices;
+
+use super::PhysicalOptimizerRule;
+
+/// The optimizer rule check the ordering requirements of the aggregate expressions.
+/// And convert between FIRST_VALUE and LAST_VALUE if possible.
+/// For example, If we have an ascending values and we want LastValue from the descending requirement,
+/// it is equivalent to FirstValue with the current ascending ordering.
+///
+/// The concrete example is that, says we have values c1 with [1, 2, 3], which is an ascending order.
+/// If we want LastValue(c1 order by desc), which is the first value of reversed c1 [3, 2, 1],
+/// so we can convert the aggregate expression to FirstValue(c1 order by asc),
+/// since the current ordering is already satisfied, it saves our time!
+#[derive(Default)]
+pub struct ConvertFirstLast {}
+
+impl ConvertFirstLast {
+    pub fn new() -> Self {
+        Self::default()
+    }
+}
+
+impl PhysicalOptimizerRule for ConvertFirstLast {
+    fn optimize(
+        &self,
+        plan: Arc<dyn ExecutionPlan>,
+        _config: &ConfigOptions,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        plan.transform_up(&get_common_requirement_of_aggregate_input)
+            .data()
+    }
+
+    fn name(&self) -> &str {
+        "SimpleOrdering"
+    }
+
+    fn schema_check(&self) -> bool {
+        true
+    }
+}
+
+fn get_common_requirement_of_aggregate_input(
+    plan: Arc<dyn ExecutionPlan>,
+) -> Result<Transformed<Arc<dyn ExecutionPlan>>> {
+    // Optimize children
+    let children = plan.children();
+    let mut is_child_transformed = false;
+    let mut new_children: Vec<Arc<dyn ExecutionPlan>> = vec![];
+    for c in children.iter() {
+        let res = optimize_internal(c.clone())?;
+        if res.transformed {
+            is_child_transformed = true;
+        }
+        new_children.push(res.data);
+    }
+
+    // Update children if transformed
+    let plan = if is_child_transformed {
+        plan.with_new_children(new_children)?
+    } else {
+        plan
+    };
+
+    // Update itself
+    let plan = optimize_internal(plan)?;
+
+    // If one of the children is transformed, then the plan is considered transformed, then we update
+    // the children of the plan from bottom to top.
+    if plan.transformed || is_child_transformed {
+        Ok(Transformed::yes(plan.data))
+    } else {
+        Ok(Transformed::no(plan.data))
+    }
+}
+
+/// In `create_initial_plan` for LogicalPlan::Aggregate, we have a nested AggregateExec where the first layer

Review Comment:
   Thank you for this comment. It makes things much clearer.



##########
datafusion/core/src/physical_optimizer/convert_first_last.rs:
##########
@@ -0,0 +1,294 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion_common::Result;
+use datafusion_common::{
+    config::ConfigOptions,
+    tree_node::{Transformed, TransformedResult, TreeNode},
+};
+use datafusion_physical_expr::expressions::{FirstValue, LastValue};
+use datafusion_physical_expr::{
+    equivalence::ProjectionMapping, reverse_order_bys, AggregateExpr,
+    EquivalenceProperties, PhysicalSortRequirement,
+};
+use datafusion_physical_plan::aggregates::concat_slices;
+use datafusion_physical_plan::{
+    aggregates::{AggregateExec, AggregateMode},
+    ExecutionPlan, ExecutionPlanProperties, InputOrderMode,
+};
+use std::sync::Arc;
+
+use datafusion_physical_plan::windows::get_ordered_partition_by_indices;
+
+use super::PhysicalOptimizerRule;
+
+/// The optimizer rule check the ordering requirements of the aggregate expressions.
+/// And convert between FIRST_VALUE and LAST_VALUE if possible.
+/// For example, If we have an ascending values and we want LastValue from the descending requirement,
+/// it is equivalent to FirstValue with the current ascending ordering.
+///
+/// The concrete example is that, says we have values c1 with [1, 2, 3], which is an ascending order.
+/// If we want LastValue(c1 order by desc), which is the first value of reversed c1 [3, 2, 1],
+/// so we can convert the aggregate expression to FirstValue(c1 order by asc),
+/// since the current ordering is already satisfied, it saves our time!
+#[derive(Default)]
+pub struct ConvertFirstLast {}
+
+impl ConvertFirstLast {
+    pub fn new() -> Self {
+        Self::default()
+    }
+}
+
+impl PhysicalOptimizerRule for ConvertFirstLast {
+    fn optimize(
+        &self,
+        plan: Arc<dyn ExecutionPlan>,
+        _config: &ConfigOptions,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        plan.transform_up(&get_common_requirement_of_aggregate_input)
+            .data()
+    }
+
+    fn name(&self) -> &str {
+        "SimpleOrdering"
+    }
+
+    fn schema_check(&self) -> bool {
+        true
+    }
+}
+
+fn get_common_requirement_of_aggregate_input(
+    plan: Arc<dyn ExecutionPlan>,
+) -> Result<Transformed<Arc<dyn ExecutionPlan>>> {
+    // Optimize children

Review Comment:
   Since this rule already calls `transform_up`, which handles the recursion up the tree of `ExecutionPlan` and manages the `transformed` flag, I don't think you also need to recursively walk down the children here again. I think you can probably just call `optimize_internal` directly.

   Recursing back down the tree is also like N^2 (or worse) in the number of plan nodes, so I think we should avoid it for performance reasons (in addition to making the code simpler).



##########
datafusion/core/src/physical_optimizer/convert_first_last.rs:
##########
@@ -0,0 +1,294 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion_common::Result;
+use datafusion_common::{
+    config::ConfigOptions,
+    tree_node::{Transformed, TransformedResult, TreeNode},
+};
+use datafusion_physical_expr::expressions::{FirstValue, LastValue};
+use datafusion_physical_expr::{
+    equivalence::ProjectionMapping, reverse_order_bys, AggregateExpr,
+    EquivalenceProperties, PhysicalSortRequirement,
+};
+use datafusion_physical_plan::aggregates::concat_slices;
+use datafusion_physical_plan::{
+    aggregates::{AggregateExec, AggregateMode},
+    ExecutionPlan, ExecutionPlanProperties, InputOrderMode,
+};
+use std::sync::Arc;
+
+use datafusion_physical_plan::windows::get_ordered_partition_by_indices;
+
+use super::PhysicalOptimizerRule;
+
+/// The optimizer rule check the ordering requirements of the aggregate expressions.
+/// And convert between FIRST_VALUE and LAST_VALUE if possible.
+/// For example, If we have an ascending values and we want LastValue from the descending requirement,
+/// it is equivalent to FirstValue with the current ascending ordering.
+///
+/// The concrete example is that, says we have values c1 with [1, 2, 3], which is an ascending order.
+/// If we want LastValue(c1 order by desc), which is the first value of reversed c1 [3, 2, 1],
+/// so we can convert the aggregate expression to FirstValue(c1 order by asc),
+/// since the current ordering is already satisfied, it saves our time!
+#[derive(Default)]
+pub struct ConvertFirstLast {}
+
+impl ConvertFirstLast {
+    pub fn new() -> Self {
+        Self::default()
+    }
+}
+
+impl PhysicalOptimizerRule for ConvertFirstLast {
+    fn optimize(
+        &self,
+        plan: Arc<dyn ExecutionPlan>,
+        _config: &ConfigOptions,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        plan.transform_up(&get_common_requirement_of_aggregate_input)
+            .data()
+    }
+
+    fn name(&self) -> &str {
+        "SimpleOrdering"
+    }
+
+    fn schema_check(&self) -> bool {
+        true
+    }
+}
+
+fn get_common_requirement_of_aggregate_input(
+    plan: Arc<dyn ExecutionPlan>,
+) -> Result<Transformed<Arc<dyn ExecutionPlan>>> {
+    // Optimize children
+    let children = plan.children();
+    let mut is_child_transformed = false;
+    let mut new_children: Vec<Arc<dyn ExecutionPlan>> = vec![];
+    for c in children.iter() {
+        let res = optimize_internal(c.clone())?;
+        if res.transformed {
+            is_child_transformed = true;
+        }
+        new_children.push(res.data);
+    }
+
+    // Update children if transformed
+    let plan = if is_child_transformed {
+        plan.with_new_children(new_children)?
+    } else {
+        plan
+    };
+
+    // Update itself
+    let plan = optimize_internal(plan)?;
+
+    // If one of the children is transformed, then the plan is considered transformed, then we update
+    // the children of the plan from bottom to top.
+    if plan.transformed || is_child_transformed {
+        Ok(Transformed::yes(plan.data))
+    } else {
+        Ok(Transformed::no(plan.data))
+    }
+}
+
+/// In `create_initial_plan` for LogicalPlan::Aggregate, we have a nested AggregateExec where the first layer
+/// is in Partial mode and the second layer is in Final or Finalpartitioned mode.
+/// If the first layer of aggregate plan is transformed, we need to update the child of the layer with final mode.
+/// Therefore, we check it and get the updated aggregate expressions.
+///
+/// If AggregateExec is created from elsewhere, we skip the check and return the original aggregate expressions.
+fn try_get_updated_aggr_expr_from_child(
+    aggr_exec: &AggregateExec,
+) -> Vec<Arc<dyn AggregateExpr>> {
+    let input = aggr_exec.input();
+    if aggr_exec.mode() == &AggregateMode::Final
+        || aggr_exec.mode() == &AggregateMode::FinalPartitioned
+    {
+        // Some aggregators may be modified during initialization for
+        // optimization purposes. For example, a FIRST_VALUE may turn
+        // into a LAST_VALUE with the reverse ordering requirement.
+        // To reflect such changes to subsequent stages, use the updated
+        // `AggregateExpr`/`PhysicalSortExpr` objects.
+        //
+        // The bottom up transformation is the mirror of LogicalPlan::Aggregate creation in [create_initial_plan]
+        if let Some(c_aggr_exec) = input.as_any().downcast_ref::<AggregateExec>() {
+            if c_aggr_exec.mode() == &AggregateMode::Partial {
+                // If the input is an AggregateExec in Partial mode, then the
+                // input is a CoalescePartitionsExec. In this case, the
+                // AggregateExec is the second stage of aggregation. The
+                // requirements of the second stage are the requirements of
+                // the first stage.
+                return c_aggr_exec.aggr_expr().to_vec();
+            }
+        }
+    }
+
+    aggr_exec.aggr_expr().to_vec()
+}
+
+fn optimize_internal(
+    plan: Arc<dyn ExecutionPlan>,
+) -> Result<Transformed<Arc<dyn ExecutionPlan>>> {
+    if let Some(aggr_exec) = plan.as_any().downcast_ref::<AggregateExec>() {
+        let input = aggr_exec.input();
+        let mut aggr_expr = try_get_updated_aggr_expr_from_child(aggr_exec);
+        let group_by = aggr_exec.group_by();
+        let mode = aggr_exec.mode();
+
+        let input_eq_properties = input.equivalence_properties();
+        let groupby_exprs = group_by.input_exprs();
+        // If existing ordering satisfies a prefix of the GROUP BY expressions,
+        // prefix requirements with this section. In this case, aggregation will
+        // work more efficiently.
+        let indices = get_ordered_partition_by_indices(&groupby_exprs, input);
+        let requirement = indices
+            .iter()
+            .map(|&idx| PhysicalSortRequirement {
+                expr: groupby_exprs[idx].clone(),
+                options: None,
+            })
+            .collect::<Vec<_>>();
+
+        try_convert_first_last_if_better(
+            &requirement,
+            &mut aggr_expr,
+            input_eq_properties,
+        )?;
+
+        let required_input_ordering = (!requirement.is_empty()).then_some(requirement);
+
+        let input_order_mode =
+            if indices.len() == groupby_exprs.len() && !indices.is_empty() {
+                InputOrderMode::Sorted
+            } else if !indices.is_empty() {
+                InputOrderMode::PartiallySorted(indices)
+            } else {
+                InputOrderMode::Linear
+            };
+        let projection_mapping =
+            ProjectionMapping::try_new(group_by.expr(), &input.schema())?;
+
+        let cache = AggregateExec::compute_properties(
+            input,
+            plan.schema().clone(),
+            &projection_mapping,
+            mode,
+            &input_order_mode,
+        );
+
+        let aggr_exec = aggr_exec.new_with_aggr_expr_and_ordering_info(
+            required_input_ordering,
+            aggr_expr,
+            cache,
+            input_order_mode,
+        );
+
+        Ok(Transformed::yes(
+            Arc::new(aggr_exec) as Arc<dyn ExecutionPlan>
+        ))
+    } else {
+        Ok(Transformed::no(plan))
+    }
+}
+
+/// Get the common requirement that satisfies all the aggregate expressions.
+///
+/// # Parameters
+///
+/// - `aggr_exprs`: A slice of `Arc<dyn AggregateExpr>` containing all the
+///   aggregate expressions.
+/// - `group_by`: A reference to a `PhysicalGroupBy` instance representing the
+///   physical GROUP BY expression.
+/// - `eq_properties`: A reference to an `EquivalenceProperties` instance
+///   representing equivalence properties for ordering.
+/// - `agg_mode`: A reference to an `AggregateMode` instance representing the
+///   mode of aggregation.
+///
+/// # Returns
+///
+/// A `LexRequirement` instance, which is the requirement that satisfies all the
+/// aggregate requirements. Returns an error in case of conflicting requirements.
+///
+/// Similar to the one in datafusion/physical-plan/src/aggregates/mod.rs, but this
+/// function care only the possible conversion between FIRST_VALUE and LAST_VALUE
+fn try_convert_first_last_if_better(
+    prefix_requirement: &[PhysicalSortRequirement],
+    aggr_exprs: &mut [Arc<dyn AggregateExpr>],
+    eq_properties: &EquivalenceProperties,
+) -> Result<()> {
+    for aggr_expr in aggr_exprs.iter_mut() {
+        let aggr_req = aggr_expr.order_bys().unwrap_or(&[]);
+        let reverse_aggr_req = reverse_order_bys(aggr_req);
+        let aggr_req = PhysicalSortRequirement::from_sort_exprs(aggr_req);
+        let reverse_aggr_req =
+            PhysicalSortRequirement::from_sort_exprs(&reverse_aggr_req);
+
+        if let Some(first_value) = aggr_expr.as_any().downcast_ref::<FirstValue>() {

Review Comment:
   Eventually (some other PR) it would be amazing if we can move this code into `FirstValue` somehow. As it is now, there is a coupling between the optimizer rule and the actual `PhysicalExpr` -- which means among other things this same optimization can't be used by user defined aggregates
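
To illustrate the review comment about `transform_up`, here is a minimal sketch on a toy tree. It assumes nothing about DataFusion's real API: `Node`, this `transform_up`, `rewrite_one`, and `run_demo` are hypothetical stand-ins, and the `(Node, bool)` pair only mimics the idea of a `transformed` flag. It shows that when the traversal already visits children first and combines the flags, the per-node callback only needs to look at the node it is handed.

```rust
// Toy stand-in for a plan node; NOT DataFusion's `ExecutionPlan`.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    name: String,
    children: Vec<Node>,
}

/// Hypothetical bottom-up traversal (an assumption for illustration, not the
/// `TreeNode` API): rewrite all children first, then apply `f` to the node.
/// Returns the rewritten node and whether anything in the subtree changed.
fn transform_up<F>(node: Node, f: &mut F) -> (Node, bool)
where
    F: FnMut(Node) -> (Node, bool),
{
    let Node { name, children } = node;
    let mut changed = false;
    let mut new_children = Vec::with_capacity(children.len());
    for child in children {
        let (rewritten, child_changed) = transform_up(child, f);
        changed |= child_changed;
        new_children.push(rewritten);
    }
    let (node, self_changed) = f(Node { name, children: new_children });
    (node, changed || self_changed)
}

/// Per-node rewrite: inspects only the node it receives and never walks
/// into the children, because the traversal has already handled them.
fn rewrite_one(node: Node) -> (Node, bool) {
    if node.name == "agg" {
        let renamed = Node { name: "agg_opt".to_string(), children: node.children };
        (renamed, true)
    } else {
        (node, false)
    }
}

/// Builds a three-node plan (agg -> agg -> scan), rewrites it, and reports
/// the result, the combined changed flag, and how often the callback ran.
fn run_demo() -> (Node, bool, usize) {
    let scan = Node { name: "scan".to_string(), children: vec![] };
    let partial = Node { name: "agg".to_string(), children: vec![scan] };
    let final_agg = Node { name: "agg".to_string(), children: vec![partial] };

    let mut calls = 0usize;
    let (out, changed) = transform_up(final_agg, &mut |n: Node| {
        calls += 1;
        rewrite_one(n)
    });
    (out, changed, calls)
}

fn main() {
    let (out, changed, calls) = run_demo();
    println!("{out:?} changed={changed} calls={calls}");
}
```

In this sketch the callback runs once per node (three calls for three nodes). A callback that additionally re-processed each node's children would revisit nodes the traversal has already handled, which is the extra work the comment suggests avoiding.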
