[PR] [python] Push limit down to the reader layer for PK merge-on-read [paimon]

via GitHub Sun, 10 May 2026 10:54:05 -0700


TheR1sing3un opened a new pull request, #7808:
URL: https://github.com/apache/paimon/pull/7808


   ## Background
   
   PR #7742 fixed ``with_limit`` at the **scan** layer: ``TableScan`` /
   ``FileScanner`` now drop splits whose row counts exceed the budget.
   The **reader** layer, however, still drained every retained split to
   completion before the consumer trimmed the result. On PK
   merge-on-read in particular, ``with_limit(5)`` would merge hundreds
   or thousands of rows per split and discard all but the first five at
   ``to_arrow``, so the IO and CPU cost was set by the split sizes and
   did not shrink with the limit value.
   
   ## Effect
   
   The same query now stops at exactly N rows. The merge pipeline gains a
   ``LimitedRecordReader`` wrapper at its outermost stage, and
   ``TableRead`` tracks a counter across splits so it stops opening
   further splits once the budget is met. The Ray path is capped on top
   with ``ds.limit(N)`` so independent workers can't collectively
   overshoot.
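
A minimal sketch of the cross-split budget described above, not the actual ``TableRead`` code. The names ``read_with_budget`` and ``make_split`` are hypothetical, splits are modelled as zero-argument callables that return a row iterator, and a negative limit is assumed to behave like zero. The last few lines use ``itertools.islice`` as a stand-in for the final ``ds.limit(N)`` cap on the Ray path.

```python
from itertools import chain, islice
from typing import Callable, Iterable, Iterator, List


def read_with_budget(
    split_openers: Iterable[Callable[[], Iterator[dict]]],
    limit: int,
) -> Iterator[dict]:
    """Yield at most `limit` rows, opening each split only when needed."""
    remaining = limit
    for open_split in split_openers:
        if remaining <= 0:
            break  # budget met: do not open any further split
        for row in open_split():
            yield row
            remaining -= 1
            if remaining <= 0:
                break  # stop in the middle of the current split


def make_split(split_id: int, n_rows: int, log: List[int]):
    """Hypothetical split factory that records when a split is opened."""
    def open_split() -> Iterator[dict]:
        log.append(split_id)
        return iter({"split": split_id, "i": i} for i in range(n_rows))
    return open_split


# Single reader: four splits of three rows each, limit 5.
opened: List[int] = []
rows = list(read_with_budget([make_split(s, 3, opened) for s in range(4)], limit=5))

# Three independent workers each honour the limit on their own splits,
# so together they can return up to 3 * 5 rows; a final cap trims to 5.
worker_outputs = [
    list(read_with_budget([make_split(s, 3, []) for s in range(2)], limit=5))
    for _ in range(3)
]
total_before_cap = sum(len(out) for out in worker_outputs)
capped = list(islice(chain.from_iterable(worker_outputs), 5))
```

In this sketch only the first two splits are ever opened, and the per-worker limit alone would still let fifteen rows through, which is why the final cap is applied on top.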
   
   ## Commits
   
   1. **Add LimitedRecordReader for row-level limit pushdown** — the
      wrapper plus 9 unit tests, including a ``read_batch_calls``
      counter assertion that proves the inner reader is not pulled past
      the limit.
   2. **Push limit down through TableRead and MergeFileSplitRead** —
      ``ReadBuilder.new_read`` → ``TableRead.limit`` → cross-split
      counter in ``to_iterator`` / ``_arrow_batch_generator`` →
      ``MergeFileSplitRead`` wraps the merge unwrap → ``RayDatasource``
      forwards the limit and ``read_paimon`` / ``to_ray`` cap the final
      Ray Dataset.
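
To illustrate commit 1, here is a sketch of a row-level limit wrapper and of the kind of ``read_batch_calls`` assertion mentioned there. It is not the PR's ``LimitedRecordReader``: the class names ``ListReader`` and ``LimitedReader`` are hypothetical, batches are plain Python lists, the reader interface is assumed to be ``read_batch()`` returning ``None`` at end of input plus ``close()``, and a negative limit is assumed to be treated as zero.

```python
from typing import List, Optional


class ListReader:
    """Hypothetical inner reader: hands out pre-built batches and counts pulls."""

    def __init__(self, batches: List[List[int]]):
        self._batches = list(batches)
        self.read_batch_calls = 0
        self.closed = False

    def read_batch(self) -> Optional[List[int]]:
        self.read_batch_calls += 1
        return self._batches.pop(0) if self._batches else None

    def close(self) -> None:
        self.closed = True


class LimitedReader:
    """Sketch of a limit wrapper placed at the outermost stage of a pipeline."""

    def __init__(self, inner: ListReader, limit: int):
        self._inner = inner
        self._remaining = max(limit, 0)  # assumption: negative means zero

    def read_batch(self) -> Optional[List[int]]:
        if self._remaining == 0:
            return None  # budget spent: never pull from the inner reader again
        batch = self._inner.read_batch()
        if batch is None:
            return None  # inner reader is exhausted before the limit
        batch = batch[: self._remaining]  # trim the batch that crosses the limit
        self._remaining -= len(batch)
        return batch

    def close(self) -> None:
        self._inner.close()  # propagate close to the wrapped reader


inner = ListReader([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
reader = LimitedReader(inner, limit=5)
out: List[int] = []
while (batch := reader.read_batch()) is not None:
    out.extend(batch)
reader.close()
```

With a limit of 5 the third batch is never requested, so the counter on the inner reader stays at 2; that is the property a does-not-drain-inner test would check.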
   
   ## Tests
   
   - 9 unit tests for ``LimitedRecordReader`` (batch / iterator / close
     propagation / zero / negative / does-not-drain-inner).
   - 8 e2e cases in ``test_limit_pushdown.py``: append-only single
     split, spans-multiple-splits, zero, oversize, PK merge with
     multiple snapshots (4 different N values), PK merge with predicate
     + limit, and the ``to_iterator`` consumer.
   - Existing ``reader_*_test.py`` limit cases switch from the old
     "first-split-full" expectation to the new exact-N expectation.
   
   All read-path regression tests pass locally (85/85 across
   ``reader_pk``, ``reader_append_only``, ``file_store_commit``,
   streaming scan, split provider, ray integration).
   
   ## Out of scope
   
   - DataEvolution + limit row-level short-circuit: that path returns
     RecordBatchReaders end-to-end, which needs a separate batch-slice
     treatment; left as a follow-up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
