alamb commented on code in PR #8034:
URL: https://github.com/apache/arrow-datafusion/pull/8034#discussion_r1382365034
##########
datafusion/physical-expr/src/equivalence.rs:
##########
@@ -20,26 +20,114 @@ use std::hash::Hash;
use std::sync::Arc;
use crate::expressions::Column;
-use crate::physical_expr::{deduplicate_physical_exprs, have_common_entries};
use crate::sort_properties::{ExprOrdering, SortProperties};
use crate::{
- physical_exprs_contains, LexOrdering, LexOrderingRef, LexRequirement,
- LexRequirementRef, PhysicalExpr, PhysicalSortExpr, PhysicalSortRequirement,
+ physical_exprs_bag_equal, physical_exprs_contains, physical_exprs_equal, LexOrdering,
+ LexOrderingRef, LexRequirement, LexRequirementRef, PhysicalExpr, PhysicalSortExpr,
+ PhysicalSortRequirement,
};
use arrow::datatypes::SchemaRef;
use arrow_schema::SortOptions;
use datafusion_common::tree_node::{Transformed, TreeNode};
use datafusion_common::{JoinSide, JoinType, Result};
+use crate::physical_expr::deduplicate_physical_exprs;
use indexmap::map::Entry;
use indexmap::IndexMap;
/// An `EquivalenceClass` is a set of [`Arc<dyn PhysicalExpr>`]s that are known
/// to have the same value for all tuples in a relation. These are generated by
-/// equality predicates, typically equi-join conditions and equality conditions
-/// in filters.
-pub type EquivalenceClass = Vec<Arc<dyn PhysicalExpr>>;
+/// equality predicates (e.g. `a = b`), typically equi-join conditions and
+/// equality conditions in filters.
+#[derive(Debug, Clone)]
+pub struct EquivalenceClass {
Review Comment:
> You seem to be keeping the lower-level functional primitives around, which
> is also good -- we can reuse them to create encapsulations like
> EquivalenceClass as we build more.

I did this mostly to keep the initial diff small, which makes the proposal
easier to review.

It seems to me that `EquivalenceClass` has very few functions actually
related to equivalence calculations -- it is mostly a container of
`PhysicalExpr`s -- so maybe it would be better named something like
`PhysicalExprList` 🤔 But then the equivalence calculations would perhaps
be less readable.
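
To make the "mostly a container" point concrete, here is a minimal, self-contained sketch of the alias-to-struct change. It is not the code in this PR: the `PhysicalExpr` trait below is a simplified stand-in for the real one, and `Column`, `col`, `push`, `contains`, `len`, and `is_empty` are hypothetical names chosen for illustration. Comparing expressions by name is likewise an assumption made to keep the example short.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Simplified stand-in for DataFusion's `PhysicalExpr` trait (assumption:
/// the real trait is much larger and compares expressions differently).
pub trait PhysicalExpr: Debug {
    /// Name used in this sketch as the sole basis for equality.
    fn name(&self) -> &str;
}

/// Hypothetical column expression, identified by its name alone.
#[derive(Debug)]
pub struct Column(pub String);

impl PhysicalExpr for Column {
    fn name(&self) -> &str {
        &self.0
    }
}

/// Before: a bare alias, so every `Vec` method is available to callers.
pub type EquivalenceClassAlias = Vec<Arc<dyn PhysicalExpr>>;

/// After: a struct that owns the list and exposes only set-like operations.
#[derive(Debug, Clone, Default)]
pub struct EquivalenceClass {
    exprs: Vec<Arc<dyn PhysicalExpr>>,
}

impl EquivalenceClass {
    /// Creates an empty class.
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns true if an expression with the same name is already present.
    pub fn contains(&self, expr: &Arc<dyn PhysicalExpr>) -> bool {
        self.exprs.iter().any(|e| e.name() == expr.name())
    }

    /// Adds `expr` unless it is already present; returns true if it was added.
    pub fn push(&mut self, expr: Arc<dyn PhysicalExpr>) -> bool {
        if self.contains(&expr) {
            false
        } else {
            self.exprs.push(expr);
            true
        }
    }

    /// Number of distinct expressions in the class.
    pub fn len(&self) -> usize {
        self.exprs.len()
    }

    /// Returns true if the class holds no expressions.
    pub fn is_empty(&self) -> bool {
        self.exprs.is_empty()
    }
}

/// Hypothetical helper that builds a column expression.
pub fn col(name: &str) -> Arc<dyn PhysicalExpr> {
    Arc::new(Column(name.to_string()))
}
```

In this sketch nothing on the struct is specific to equivalence reasoning: it is a deduplicating list, which is why a name like `PhysicalExprList` would also describe it. What the struct does buy over the alias is that callers can no longer insert duplicates by calling `Vec::push` directly.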