Copilot commented on code in PR #19400:
URL: https://github.com/apache/datafusion/pull/19400#discussion_r2634150529
##########
datafusion/physical-plan/src/coalesce_batches.rs:
##########
@@ -226,6 +227,19 @@ impl ExecutionPlan for CoalesceBatchesExec {
CardinalityEffect::Equal
}
+ fn try_swapping_with_projection(
+ &self,
+ projection: &ProjectionExec,
+ ) -> Result<Option<Arc<dyn ExecutionPlan>>> {
+ match self.input.try_swapping_with_projection(projection)? {
+ Some(new_input) => Ok(Some(Arc::new(
+ CoalesceBatchesExec::new(new_input, self.target_batch_size)
+ .with_fetch(self.fetch),
+ ))),
+ None => Ok(None),
+ }
+ }
Review Comment:
The implementation is missing a check to verify if the projection narrows
the schema before attempting to push it down. All similar operators in the
codebase (CoalescePartitionsExec, UnionExec, RepartitionExec, SortExec,
SortPreservingMergeExec) include this check as an optimization. Without this
check, the projection pushdown will be attempted even when it provides no
benefit (i.e., when the projection doesn't reduce the number of columns).
Consider adding this check at the beginning of the method to return early if
the projection doesn't narrow the schema.
##########
datafusion/physical-plan/src/coop.rs:
##########
@@ -298,6 +299,16 @@ impl ExecutionPlan for CooperativeExec {
Equal
}
+ fn try_swapping_with_projection(
+ &self,
+ projection: &ProjectionExec,
+ ) -> Result<Option<Arc<dyn ExecutionPlan>>> {
+ match self.input.try_swapping_with_projection(projection)? {
+ Some(new_input) =>
Ok(Some(Arc::new(CooperativeExec::new(new_input)))),
+ None => Ok(None),
+ }
+ }
Review Comment:
The implementation is missing a check to verify if the projection narrows
the schema before attempting to push it down. All similar operators in the
codebase (CoalescePartitionsExec, UnionExec, RepartitionExec, SortExec,
SortPreservingMergeExec) include this check as an optimization. Without this
check, the projection pushdown will be attempted even when it provides no
benefit (i.e., when the projection doesn't reduce the number of columns).
Consider adding this check at the beginning of the method to return early if
the projection doesn't narrow the schema.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]