andygrove commented on a change in pull request #9043:
URL: https://github.com/apache/arrow/pull/9043#discussion_r549808136
##########
File path: rust/datafusion/src/physical_plan/planner.rs
##########
@@ -110,6 +111,16 @@ impl DefaultPhysicalPlanner {
// leaf node, children cannot be replaced
Ok(plan.clone())
} else {
+ // wrap filter in coalesce batches
+ let plan = if plan.as_any().downcast_ref::<FilterExec>().is_some() {
+ let target_batch_size = ctx_state.config.batch_size;
+ Arc::new(CoalesceBatchesExec::new(plan.clone(), target_batch_size))
Review comment:
I filed https://issues.apache.org/jira/browse/ARROW-11068 to wrap join
output as well and to make this mechanism more generic.
Rather than hard-coding a list of operators that need to be wrapped, we
should let plans declare whether their input and/or output batches should be
coalesced, similar to how we handle partitioning. Custom operators outside of
DataFusion could then benefit from this optimization too.
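To make the suggestion concrete, here is a rough, self-contained sketch of what such a declaration could look like. It does not use the real DataFusion API: the `ExecutionPlan` trait below is a simplified stand-in, `coalesce_output` and `wrap_if_requested` are hypothetical names, and the structs only mimic `FilterExec` and `CoalesceBatchesExec`. The real design is left to ARROW-11068.

```rust
use std::sync::Arc;

/// Simplified stand-in for DataFusion's `ExecutionPlan` trait (assumption).
trait ExecutionPlan {
    /// Human-readable description of this node and its input.
    fn name(&self) -> String;

    /// Hypothetical hook: an operator returns `true` when its output
    /// batches tend to be small and should be coalesced. Defaults to
    /// `false` so existing operators are unaffected.
    fn coalesce_output(&self) -> bool {
        false
    }
}

/// Leaf operator that does not ask for coalescing.
struct ScanExec;

/// Mimics a filter, which can emit many small batches.
struct FilterExec {
    input: Arc<dyn ExecutionPlan>,
}

/// Mimics the wrapper operator introduced in this PR.
struct CoalesceBatchesExec {
    input: Arc<dyn ExecutionPlan>,
    target_batch_size: usize,
}

impl ExecutionPlan for ScanExec {
    fn name(&self) -> String {
        "ScanExec".to_string()
    }
}

impl ExecutionPlan for FilterExec {
    fn name(&self) -> String {
        format!("FilterExec({})", self.input.name())
    }

    // The operator itself declares that its output should be coalesced.
    fn coalesce_output(&self) -> bool {
        true
    }
}

impl ExecutionPlan for CoalesceBatchesExec {
    fn name(&self) -> String {
        format!(
            "CoalesceBatchesExec[{}]({})",
            self.target_batch_size,
            self.input.name()
        )
    }
}

/// Planner-side helper (hypothetical): wraps any plan that asks for it,
/// with no hard-coded list of operator types and no downcasting.
fn wrap_if_requested(
    plan: Arc<dyn ExecutionPlan>,
    target_batch_size: usize,
) -> Arc<dyn ExecutionPlan> {
    if plan.coalesce_output() {
        Arc::new(CoalesceBatchesExec {
            input: plan,
            target_batch_size,
        })
    } else {
        plan
    }
}
```

With something like this, the planner would call `wrap_if_requested(plan, ctx_state.config.batch_size)` instead of downcasting to `FilterExec`, and a custom operator defined outside DataFusion could opt in by overriding `coalesce_output`. A matching hook for input batches could follow the same pattern.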