alamb commented on code in PR #5661:
URL: https://github.com/apache/arrow-datafusion/pull/5661#discussion_r1148371213


##########
datafusion/core/src/physical_plan/joins/symmetric_hash_join.rs:
##########
@@ -399,10 +393,14 @@ impl ExecutionPlan for SymmetricHashJoinExec {
         self.schema.clone()
     }
 
-    fn required_input_ordering(&self) -> Vec<Option<&[PhysicalSortExpr]>> {
+    fn required_input_ordering(&self) -> 
Vec<Option<Vec<PhysicalSortRequirement>>> {

Review Comment:
   I wonder if it is time to make a structure or something -- 
`Vec<Option<Vec<..>>>` is starting to be a significant cognitive load to reason 
about. Encapsulating into a structure like `struct InputOrderingRequirements` 
might make it easier to reason about as well as make constructing this easier



##########
datafusion/core/src/physical_plan/joins/symmetric_hash_join.rs:
##########
@@ -399,10 +393,14 @@ impl ExecutionPlan for SymmetricHashJoinExec {
         self.schema.clone()
     }
 
-    fn required_input_ordering(&self) -> Vec<Option<&[PhysicalSortExpr]>> {
+    fn required_input_ordering(&self) -> 
Vec<Option<Vec<PhysicalSortRequirement>>> {

Review Comment:
   I actually spent some time trying to test bringing this change to IOx and I 
found this particular part quite challenging:
   1. Figuring out `Vec<Option<Vec<..>>>` and how to 
   2. `make_sort_requirements_from_exprs` appears not to be public so I 
couldn't use it.
   
   If we are going to change the signature of these traits, I think we should 
think about how make it easier to use.
   
   For example,  I wonder if we could make a structure like `Distribution` to 
https://docs.rs/datafusion/20.0.0/datafusion/physical_plan/enum.Distribution.html
 to describe an `Order`? 
   
   
   This would both 
   1. make it more convenient to construct these lists 
   2. provide a place to document what they are and all the subtlety 
   
   Perhaps something like 
   
   ```rust
   impl ExecutionPlan {
       // the order required for the plans inputs
       fn required_input_ordering(&self) -> Vec<Ordering> ;
   
       // output order produced by this plan
       fn output_ordering(&self) -> Ordering ;
   }
   ```
   
   And then encapsulate all the gory details about orderings and comparing them 
in `Ordering`:
   
   ```rust 
   struct Ordering {
     exprs: Vec<dyn PhysicalExpr>
     // parallel list if there are any particular requirements on the sort 
orders
     required_options: Vec<SortOptions>
   }
   
   impl RequiredInputOrdering {
     // construct a required input 
     pub fn from_sort_exprs(exprs: Vec<dyn PhysicalSortExpr>) -> Self {
     .. 
     }
    /// does this represent any requirement on the input ordering
    fn input_order_required(&self) ->  bool {}
   
    // Is there any NULLS FIRST / NULLS LAST 
   ...
   }
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to