andygrove commented on a change in pull request #9043:
URL: https://github.com/apache/arrow/pull/9043#discussion_r549808136



##########
File path: rust/datafusion/src/physical_plan/planner.rs
##########
@@ -110,6 +111,16 @@ impl DefaultPhysicalPlanner {
             // leaf node, children cannot be replaced
             Ok(plan.clone())
         } else {
+            // wrap filter in coalesce batches
+            let plan = if plan.as_any().downcast_ref::<FilterExec>().is_some() {
+                let target_batch_size = ctx_state.config.batch_size;
+                Arc::new(CoalesceBatchesExec::new(plan.clone(), target_batch_size))

Review comment:
I filed https://issues.apache.org/jira/browse/ARROW-11068 to wrap join output as well, and to make this mechanism more generic.

Rather than hard-coding a list of operators that need to be wrapped, we should let plans declare whether their input and/or output batches should be coalesced (similar to how we handle partitioning). This would also allow custom operators outside of DataFusion to benefit from this optimization.
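
To make the idea concrete, here is a minimal sketch of what such a declaration could look like. It is not DataFusion's actual API: the `ExecutionPlan` trait below is a simplified stand-in, and the `should_coalesce_output` method, the `wrap_if_requested` helper, and the `name` method are all hypothetical names chosen for illustration.

```rust
use std::sync::Arc;

/// Simplified stand-in for a physical plan node (NOT the real DataFusion trait).
trait ExecutionPlan {
    /// Human-readable description of this node and its input.
    fn name(&self) -> String;

    /// Hypothetical hook: operators return true when their output batches
    /// tend to be small and would benefit from coalescing. Defaults to false.
    fn should_coalesce_output(&self) -> bool {
        false
    }
}

/// Leaf operator that keeps the default (no coalescing requested).
struct ScanExec;

impl ExecutionPlan for ScanExec {
    fn name(&self) -> String {
        "ScanExec".to_string()
    }
}

/// A filter can emit many small batches, so it opts in.
struct FilterExec {
    input: Arc<dyn ExecutionPlan>,
}

impl ExecutionPlan for FilterExec {
    fn name(&self) -> String {
        format!("FilterExec({})", self.input.name())
    }

    fn should_coalesce_output(&self) -> bool {
        true
    }
}

/// Stand-in for the wrapper operator that merges small batches.
struct CoalesceBatchesExec {
    input: Arc<dyn ExecutionPlan>,
    target_batch_size: usize,
}

impl ExecutionPlan for CoalesceBatchesExec {
    fn name(&self) -> String {
        format!(
            "CoalesceBatchesExec[{}]({})",
            self.target_batch_size,
            self.input.name()
        )
    }
}

/// The planner asks the plan instead of downcasting to a fixed operator list.
fn wrap_if_requested(
    plan: Arc<dyn ExecutionPlan>,
    target_batch_size: usize,
) -> Arc<dyn ExecutionPlan> {
    if plan.should_coalesce_output() {
        let wrapped: Arc<dyn ExecutionPlan> = Arc::new(CoalesceBatchesExec {
            input: plan,
            target_batch_size,
        });
        wrapped
    } else {
        plan
    }
}
```

With this shape, the planner no longer needs `downcast_ref::<FilterExec>()`: a custom operator defined outside DataFusion would only have to override the hook to get the same wrapping.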



