wwj6591812 opened a new pull request, #8116:
URL: https://github.com/apache/paimon/pull/8116

   ## Background
   
   For primary-key tables with multiple overlapping level-0 files, batch 
queries often take the **merge read** path (`MergeFileSplitRead`) instead of 
raw reads. Today, `LIMIT` takes effect mainly at the **manifest/scan** layer 
(only when files do not overlap) or at the **Flink operator** layer (per 
subtask). As a result, the first split can still run a full LSM merge, and 
spill, even when the query needs only a few rows.
   
   ## Why this PR
   
   We want to stop merge I/O and CPU work **as early as the record reader**, 
right after merge output is produced, for the common `SELECT … LIMIT n` case on 
multi-L0 buckets.
   
   ## What changes
   
   - Add `LimitRecordReader` in `paimon-common` and use it from 
`MergeFileSplitRead` (merge + no-merge readers).
   - Reuse the same helper in `FormatTableRead` (small refactor, no behavior 
change).
   - **Safety gate** before applying the limit (aligned with 
`KeyValueFileStoreScan#limitPushdownEnabled`, plus a check for non-PK filters): 
skip the merge-read limit when filters cannot be fully applied on overlapping 
L0 sections, deletion vectors (DV) are enabled, the merge engine is 
partial-update or aggregation, or `forceKeepDelete` is on.
   - **Logging**: one INFO log per reader when merge read limit is applied or 
disabled, with the reason.
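
The safety gate and the per-reader log line described in the list above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `ReadContext`, `MergeEngine`, `MergeReadLimitGate`, and the log wording are all hypothetical names invented for illustration, and the real checks live in Paimon classes that are not reproduced here.

```java
import java.util.Optional;

/** Hypothetical subset of merge engines named in this description. */
enum MergeEngine { DEDUPLICATE, PARTIAL_UPDATE, AGGREGATION }

/** Hypothetical, simplified view of the read settings the gate inspects. */
final class ReadContext {
    final boolean filtersFullyApplied; // false when, e.g., a non-PK filter is present
    final boolean deletionVectorsEnabled;
    final MergeEngine mergeEngine;
    final boolean forceKeepDelete;

    ReadContext(boolean filtersFullyApplied, boolean deletionVectorsEnabled,
                MergeEngine mergeEngine, boolean forceKeepDelete) {
        this.filtersFullyApplied = filtersFullyApplied;
        this.deletionVectorsEnabled = deletionVectorsEnabled;
        this.mergeEngine = mergeEngine;
        this.forceKeepDelete = forceKeepDelete;
    }
}

final class MergeReadLimitGate {
    /** Returns empty when the limit may be applied, otherwise the reason it is skipped. */
    static Optional<String> disabledReason(ReadContext ctx) {
        if (!ctx.filtersFullyApplied) {
            return Optional.of("filters cannot be fully applied on overlapping L0 sections");
        }
        if (ctx.deletionVectorsEnabled) {
            return Optional.of("deletion vectors are enabled");
        }
        if (ctx.mergeEngine == MergeEngine.PARTIAL_UPDATE
                || ctx.mergeEngine == MergeEngine.AGGREGATION) {
            return Optional.of("merge engine is " + ctx.mergeEngine);
        }
        if (ctx.forceKeepDelete) {
            return Optional.of("forceKeepDelete is on");
        }
        return Optional.empty();
    }

    /** Builds the single line a reader would log; the wording here is made up. */
    static String logLine(ReadContext ctx, long limit) {
        return disabledReason(ctx)
                .map(reason -> "Merge read limit disabled: " + reason)
                .orElse("Merge read limit applied: limit=" + limit);
    }
}
```

Returning the reason as a value keeps the decision and the log message in one place, so each reader can log exactly once whether the limit was applied or skipped.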
   
   **Stage optimized:** per-split **read** phase (merge tree reader), after 
merge but before returning rows — not manifest planning.
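
To illustrate where the limit sits, here is a small sketch of a limit-enforcing wrapper placed directly on top of merge output. It is an assumption-laden stand-in: Paimon's real `RecordReader` is batch-oriented, while this sketch uses a plain `java.util.Iterator`, and `LimitIterator` and `CountingIterator` are hypothetical names, not the `LimitRecordReader` added by this PR.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Illustrative stand-in for a limit-enforcing reader over merged records. */
final class LimitIterator<T> implements Iterator<T> {
    private final Iterator<T> upstream;
    private final long limit;
    private long emitted;

    LimitIterator(Iterator<T> upstream, long limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        this.upstream = upstream;
        this.limit = limit;
    }

    @Override
    public boolean hasNext() {
        // Check the counter first so upstream is never polled once the limit is reached.
        return emitted < limit && upstream.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T value = upstream.next();
        emitted++;
        return value;
    }
}

/** Test helper: counts how many records the wrapped (merge) iterator produced. */
final class CountingIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    long pulled;

    CountingIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        T value = delegate.next();
        pulled++;
        return value;
    }
}
```

Wrapping a `CountingIterator` over five records in a `LimitIterator` with a limit of 2 pulls only two records from the upstream, which is the effect the read-phase limit aims for: merged rows beyond the limit are never requested.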
   
   ## Tests
   
   - `MergeFileSplitReadTest#testWithLimit` — multi-L0 bucket, direct read 
stops at limit.
   - `MergeFileSplitReadTest#testWithLimitDisabledByNonPrimaryKeyFilter` — 
non-PK filter disables merge read limit.
   - `MergeFileSplitReadLimitITCase` — Flink batch SQL on multi-L0 PK table 
(`LIMIT` + `WHERE` correctness).
   
   ## Limitations
   
   - Limit is **per split reader**, not global across splits (global `LIMIT` 
still relies on engine-level limiters).
   - Does not avoid merge work inside a section before the first row is emitted.
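
The first limitation can be shown with a toy model in which each split is just a list of rows. This is only a sketch using Java streams; `PerSplitLimitDemo` and its methods are hypothetical and do not correspond to any Paimon or Flink API.

```java
import java.util.List;
import java.util.stream.Collectors;

final class PerSplitLimitDemo {
    /** Applies the limit inside each split reader only, then concatenates the results. */
    static List<Integer> perSplitOnly(List<List<Integer>> splits, long limit) {
        return splits.stream()
                .flatMap(split -> split.stream().limit(limit))
                .collect(Collectors.toList());
    }

    /** Adds an engine-level limiter on top of the per-split limit. */
    static List<Integer> withGlobalLimit(List<List<Integer>> splits, long limit) {
        return perSplitOnly(splits, limit).stream()
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

With three splits of three rows each and a limit of 2, `perSplitOnly` returns six rows (two per split), while `withGlobalLimit` returns two. The per-split limit bounds the work done in each reader, and the engine-level limiter is still what enforces the final row count.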

