2010YOUY01 commented on code in PR #20271:
URL: https://github.com/apache/datafusion/pull/20271#discussion_r2791387347


##########
datafusion/physical-optimizer/src/limit_pushdown.rs:
##########
@@ -17,6 +17,32 @@
 
 //! [`LimitPushdown`] pushes `LIMIT` down through `ExecutionPlan`s to reduce
 //! data transfer as much as possible.
+//!
+//! # Plan Limit Absorption

Review Comment:
   Good point, updated!



##########
datafusion/physical-optimizer/src/limit_pushdown.rs:
##########
@@ -17,6 +17,32 @@
 
 //! [`LimitPushdown`] pushes `LIMIT` down through `ExecutionPlan`s to reduce
 //! data transfer as much as possible.
+//!
+//! # Plan Limit Absorption
+//! In addition to pushing down [`LimitExec`] in the plan, some operators can
+//! "absorb" a limit and stop early during execution. This optimizer rule also includes
+//! limit absorption transformation through [`ExecutionPlan::with_fetch`].
+//!
+//! ## Background: vectorized volcano execution model
+//! DataFusion uses a batched volcano model. For most operators, output is
+//! produced in batches of `datafusion.execution.batch_size` (default 8192), so
+//! the batch sizes typically look like:
+//! ```text
+//! 8192, 8192, ..., 8192, 100 (the final batch may be partial)
+//! ```
+//!
+//! ## Example
+//! For a join with an expensive, selective predicate:
+//! ```text
+//! LimitExec(fetch=10)
+//! -- NestedLoopJoinExec(on=expr_expensive_and_selective)
+//! --- DataSourceExec()
+//! --- DataSourceExec()
+//! ```
+//! Under this model, `NestedLoopJoinExec` would keep working until it can emit

Review Comment:
   Updated, thanks.
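
A minimal sketch of the idea in the doc comment above, under stated assumptions. It is not DataFusion's implementation and uses no DataFusion APIs. `batch_sizes` and `rows_evaluated_for_first_batch` are hypothetical helpers, and the "emit only once a batch is full or the input is exhausted" behavior is an assumption drawn from the batched-model description, not from `NestedLoopJoinExec` itself.

```rust
/// Sizes of the batches a batched operator emits for `total_rows` rows:
/// full batches of `batch_size`, then one partial batch if rows remain.
fn batch_sizes(total_rows: usize, batch_size: usize) -> Vec<usize> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let mut sizes = vec![batch_size; total_rows / batch_size];
    if total_rows % batch_size != 0 {
        sizes.push(total_rows % batch_size);
    }
    sizes
}

/// Hypothetical stand-in for a selective operator (assumption, not a real API).
/// It scans `input`, keeps rows matching `predicate`, and emits its first
/// output batch once the batch is full, the input is exhausted, or the
/// absorbed `fetch` (if any) is reached.
/// Returns (rows evaluated, rows in the first output batch).
fn rows_evaluated_for_first_batch(
    mut input: impl Iterator<Item = u64>,
    predicate: impl Fn(u64) -> bool,
    batch_size: usize,
    fetch: Option<usize>,
) -> (usize, usize) {
    // With an absorbed limit the operator never needs more than `fetch` rows.
    let target = fetch.map_or(batch_size, |f| f.min(batch_size));
    let mut evaluated = 0;
    let mut matched = 0;
    while matched < target {
        let Some(row) = input.next() else { break };
        evaluated += 1;
        if predicate(row) {
            matched += 1;
        }
    }
    (evaluated, matched)
}
```

With a predicate that matches one row in a thousand over a million input rows, calling `rows_evaluated_for_first_batch(0..1_000_000u64, |r| r % 1000 == 0, 8192, Some(10))` stops after the tenth match, while passing `None` for `fetch` scans the whole input trying to fill a batch of 8192.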



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

