alamb commented on code in PR #5661:
URL: https://github.com/apache/arrow-datafusion/pull/5661#discussion_r1148371213
##########
datafusion/core/src/physical_plan/joins/symmetric_hash_join.rs:
##########
@@ -399,10 +393,14 @@ impl ExecutionPlan for SymmetricHashJoinExec {
self.schema.clone()
}
- fn required_input_ordering(&self) -> Vec<Option<&[PhysicalSortExpr]>> {
+ fn required_input_ordering(&self) ->
Vec<Option<Vec<PhysicalSortRequirement>>> {
Review Comment:
I wonder if it is time to make a structure or something --
`Vec<Option<Vec<..>>>` is starting to be a significant cognitive load to reason
about. Encapsulating into a structure like `struct InputOrderingRequirements`
might make it easier to reason about as well as make constructing this easier
##########
datafusion/core/src/physical_plan/joins/symmetric_hash_join.rs:
##########
@@ -399,10 +393,14 @@ impl ExecutionPlan for SymmetricHashJoinExec {
self.schema.clone()
}
- fn required_input_ordering(&self) -> Vec<Option<&[PhysicalSortExpr]>> {
+ fn required_input_ordering(&self) ->
Vec<Option<Vec<PhysicalSortRequirement>>> {
Review Comment:
I actually spent some time trying to test bringing this change to IOx and I
found this particular part quite challenging:
1. Figuring out `Vec<Option<Vec<..>>>` and how to
2. `make_sort_requirements_from_exprs` appears not to be public so I
couldn't use it.
If we are going to change the signature of these traits, I think we should
think about how make it easier to use.
For example, I wonder if we could make a structure like `Distribution` to
https://docs.rs/datafusion/20.0.0/datafusion/physical_plan/enum.Distribution.html
to describe an `Order`?
This would both
1. make it more convenient to construct these lists
2. provide a place to document what they are and all the subtlety
Perhaps something like
```rust
impl ExecutionPlan {
// the order required for the plans inputs
fn required_input_ordering(&self) -> Vec<Ordering> ;
// output order produced by this plan
fn output_ordering(&self) -> Ordering ;
}
```
And then encapsulate all the gory details about orderings and comparing them
in `Ordering`:
```rust
struct Ordering {
exprs: Vec<dyn PhysicalExpr>
// parallel list if there are any particular requirements on the sort
orders
required_options: Vec<SortOptions>
}
impl RequiredInputOrdering {
// construct a required input
pub fn from_sort_exprs(exprs: Vec<dyn PhysicalSortExpr>) -> Self {
..
}
/// does this represent any requirement on the input ordering
fn input_order_required(&self) -> bool {}
// Is there any NULLS FIRST / NULLS LAST
...
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]