wwj6591812 opened a new pull request, #7994:
URL: https://github.com/apache/paimon/pull/7994

   
   
   ## Problem
   
   When `file-index.read.enabled=true` (default) and a query has only `LIMIT N` 
(no filter), Paimon pushes the limit down as a bitmap selection and wraps the 
ORC reader with `ApplyBitmapIndexRecordReader`.
   
   Once the first N rows have been returned, 
`ApplyBitmapIndexFileRecordIterator.next()` returns `null` because 
`position > last`. However, `RecordReaderIterator` treats a `null` from the 
iterator as **batch exhaustion**, not **reader exhaustion**, and keeps calling 
`readBatch()` until EOF. The result is unnecessary full-file ORC I/O even 
though only N rows are needed.
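
The loop described above can be reproduced with a small, self-contained model. This is only a sketch under stated assumptions: `BatchReader`, `BatchIterator`, `CountingReader`, `NaiveLimitReader` and `drain` are hypothetical stand-ins invented for illustration, not Paimon classes, and the limit is modelled as a simple "positions 0..last" selection rather than a real bitmap.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; the nested types are stand-ins, not Paimon classes.
final class NaiveLimitSketch {

    interface BatchIterator {
        Integer next(); // null means "this batch is finished"
    }

    interface BatchReader {
        BatchIterator readBatch(); // null means "the reader is finished"
    }

    // Serves row positions 0..totalRows-1 in fixed-size batches and counts readBatch() calls.
    static final class CountingReader implements BatchReader {
        private final int totalRows;
        private final int batchSize;
        private int nextRow = 0;
        int readBatchCalls = 0;

        CountingReader(int totalRows, int batchSize) {
            this.totalRows = totalRows;
            this.batchSize = batchSize;
        }

        @Override
        public BatchIterator readBatch() {
            readBatchCalls++;
            if (nextRow >= totalRows) {
                return null; // EOF
            }
            int[] pos = {nextRow};
            int end = Math.min(nextRow + batchSize, totalRows);
            nextRow = end;
            return () -> {
                if (pos[0] >= end) {
                    return null; // end of this batch
                }
                return pos[0]++;
            };
        }
    }

    // Yields only positions <= last, but keeps no "selection exhausted" state.
    static final class NaiveLimitReader implements BatchReader {
        private final BatchReader inner;
        private final int last;

        NaiveLimitReader(BatchReader inner, int last) {
            this.inner = inner;
            this.last = last;
        }

        @Override
        public BatchIterator readBatch() {
            BatchIterator batch = inner.readBatch();
            if (batch == null) {
                return null;
            }
            return () -> {
                Integer row = batch.next();
                // Past the selection we can only signal "batch finished".
                return (row == null || row > last) ? null : row;
            };
        }
    }

    // Drains a reader the way a generic reader-to-iterator adapter would.
    static List<Integer> drain(BatchReader reader) {
        List<Integer> out = new ArrayList<>();
        BatchIterator batch;
        while ((batch = reader.readBatch()) != null) {
            Integer row;
            while ((row = batch.next()) != null) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        CountingReader file = new CountingReader(100, 20);
        List<Integer> rows = drain(new NaiveLimitReader(file, 9));
        // prints "10 rows, 6 readBatch() calls"
        System.out.println(rows.size() + " rows, " + file.readBatchCalls + " readBatch() calls");
    }
}
```

With 100 rows in batches of 20 and a limit of 10, the model returns 10 rows but still asks the underlying reader for all five batches plus the final EOF call.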
   
   This affects any caller using `RecordReader.toCloseableIterator()` or 
`forEachRemaining()`, not only Flink.
   
   Related Flink-side fix: https://github.com/apache/paimon/pull/7991  
   That PR stops the dedicated split path from calling `hasNext()` after the 
limit is reached in `ReadOperator`. **This PR fixes the root cause in the core 
reader layer** and benefits all read paths.
   
   ## Root Cause
   
   `ApplyBitmapIndexFileRecordIterator` stops yielding rows once `position > 
last`, but `ApplyBitmapIndexRecordReader.readBatch()` had no global exhausted 
state. So `RecordReaderIterator.advanceIfNeeded()` interpreted the `null` as 
"current batch finished, open the next batch" and looped until EOF.
   
   ## Fix
   
   Track bitmap selection exhaustion in `ApplyBitmapIndexRecordReader` with an 
`AtomicBoolean`:
   
   1. When `ApplyBitmapIndexFileRecordIterator` sees `position > last`, it sets 
`exhausted = true`.
   2. Subsequent `readBatch()` calls return `null` immediately, so 
`RecordReaderIterator` stops without scanning the rest of the file.
   
   Minimal change: only `ApplyBitmapIndexRecordReader` and 
`ApplyBitmapIndexFileRecordIterator` are modified.
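
The two steps above can be sketched in the same simplified style. Again, this is an illustration under assumptions, not the code in this PR: `BatchReader`, `BatchIterator`, `CountingReader`, `LimitReader` and `drain` are hypothetical names, and the selection is modelled as "positions 0..last" instead of a real bitmap (so the sparse case is not covered here).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified model of the exhausted-flag idea; not Paimon code.
final class ExhaustedFlagSketch {

    interface BatchIterator {
        Integer next(); // null means "this batch is finished"
    }

    interface BatchReader {
        BatchIterator readBatch(); // null means "the reader is finished"
    }

    // Serves row positions 0..totalRows-1 in fixed-size batches and counts readBatch() calls.
    static final class CountingReader implements BatchReader {
        private final int totalRows;
        private final int batchSize;
        private int nextRow = 0;
        int readBatchCalls = 0;

        CountingReader(int totalRows, int batchSize) {
            this.totalRows = totalRows;
            this.batchSize = batchSize;
        }

        @Override
        public BatchIterator readBatch() {
            readBatchCalls++;
            if (nextRow >= totalRows) {
                return null; // EOF
            }
            int[] pos = {nextRow};
            int end = Math.min(nextRow + batchSize, totalRows);
            nextRow = end;
            return () -> {
                if (pos[0] >= end) {
                    return null; // end of this batch
                }
                return pos[0]++;
            };
        }
    }

    // Yields only positions <= last and remembers when the selection is used up.
    static final class LimitReader implements BatchReader {
        private final BatchReader inner;
        private final int last;
        private final AtomicBoolean exhausted = new AtomicBoolean(false);

        LimitReader(BatchReader inner, int last) {
            this.inner = inner;
            this.last = last;
        }

        @Override
        public BatchIterator readBatch() {
            if (exhausted.get()) {
                return null; // step 2: report reader exhaustion without touching the file
            }
            BatchIterator batch = inner.readBatch();
            if (batch == null) {
                return null;
            }
            return () -> {
                Integer row = batch.next();
                if (row == null) {
                    return null; // ordinary end of batch
                }
                if (row > last) {
                    exhausted.set(true); // step 1: the iterator marks the selection as done
                    return null;
                }
                return row;
            };
        }
    }

    // Drains a reader the way a generic reader-to-iterator adapter would.
    static List<Integer> drain(BatchReader reader) {
        List<Integer> out = new ArrayList<>();
        BatchIterator batch;
        while ((batch = reader.readBatch()) != null) {
            Integer row;
            while ((row = batch.next()) != null) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        CountingReader file = new CountingReader(100, 20);
        List<Integer> rows = drain(new LimitReader(file, 9));
        // prints "10 rows, 1 readBatch() calls"
        System.out.println(rows.size() + " rows, " + file.readBatchCalls + " readBatch() calls");
    }
}
```

In this model the same 100-row, batch-size-20, limit-10 input touches the underlying reader only once, because the second `readBatch()` call on the wrapper returns `null` before delegating.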
   
   ## Relationship to #7991
   
   | PR | Layer | Scope |
   |----|-------|-------|
   | #7991 | Flink `ReadOperator` | Dedicated split path; stops extra 
`hasNext()` after limit |
   | This PR | Core `ApplyBitmapIndexRecordReader` | All callers of 
`toCloseableIterator()` / `forEachRemaining()` |
   
   The two PRs are **independent** and can be reviewed/merged separately. 
Either one fixes the reported `LIMIT N` performance issue on the dedicated 
path; together they provide defense in depth.
   
   ## Testing
   
   - `ApplyBitmapIndexRecordReaderTest` — mock reader with 100 rows (batch size 
20), limit 10 → 10 rows returned, underlying `readBatch` called once; covers 
`toCloseableIterator()` and `forEachRemaining()`; sparse bitmap case
   - `AppendOnlySimpleTableTest.testLimitWithCloseableIterator` — 5000-row 
Parquet table, limit 10 via `toCloseableIterator()`
   - `BatchFileStoreITCase.testDedicatedPathLimitTenOnManyRows` — 100 rows 
INSERT, dedicated split + `LIMIT 10` → 10 rows
   
   ```bash
   mvn test -pl paimon-common -Dtest=ApplyBitmapIndexRecordReaderTest
   
   mvn test -pl paimon-core \
     -Dtest=AppendOnlySimpleTableTest#testLimitWithCloseableIterator
   
   mvn test -pl paimon-flink/paimon-flink-common \
     -Dtest=BatchFileStoreITCase#testDedicatedPathLimitTenOnManyRows
   ```
