void-ptr974 commented on code in PR #25873:
URL: https://github.com/apache/pulsar/pull/25873#discussion_r3311458193
##########
pip/pip-480.md:
##########
@@ -0,0 +1,135 @@
+# PIP-474: Add cursorless readEntries API to ManagedLedger
+
+# Background knowledge
+
+Most `ManagedLedger` reads go through `ManagedCursor`.
+A cursor is durable state: read position, mark-delete position, and individually deleted entries.
+That is the right model for Pulsar subscriptions because the broker owns subscription progress and uses cursor state for
+acknowledgement, backlog accounting, and ledger retention.
+
+Protocol handlers such as KoP do not always want that state. Kafka offsets are maintained outside Pulsar's
+`ManagedCursor` abstraction. If KoP opens a cursor only to fetch entries, it creates extra managed-ledger metadata and
+then has to keep the cursor lifecycle aligned with Kafka offsets.
+
+`ManagedLedger` already exposes `asyncReadEntry(Position, ...)` for reading a single entry by position, but it does not
+provide a batch read primitive for this use case. Callers have to repeat the ledger traversal logic themselves: validate
+the start position, cross ledger boundaries, skip empty ledgers, and stop at the current last confirmed entry.
+
+# Motivation
+
+Provide a cursorless read path for downstream projects like KoP.
+
+A caller should be able to pass a `Position` and a maximum entry count, and get the entries that are readable now.
+The call must not open a `ManagedCursor`, update cursor metadata, or change acknowledgement state.
+The downstream protocol remains responsible for its own offsets.
+
+# Goals
+
+## In Scope
+
+* Add a `ManagedLedger` API to read a batch of entries from a given position without a cursor.
+* Treat an existing start position as inclusive.
+* Allow reads to cross ledger boundaries and skip empty or removed empty ledgers.
+* Support `PositionFactory.EARLIEST` and `PositionFactory.LATEST`.
+* Return only entries that are already readable when the call is made.
+* Ignore cursor acknowledgement state.
+
+## Out of Scope
+
+* Replace `ManagedCursor` for Pulsar subscriptions.
+* Use cursorless reads to pin backlog or change ledger retention semantics.
+* Apply dispatcher-level filtering, such as subscription acknowledgement state, delayed delivery, or transaction
+ visibility.
+
+# High Level Design
+
+Add a new asynchronous method to `ManagedLedger`:
+
+```java
+CompletableFuture<List<Entry>> asyncReadEntries(Position startPosition, int numberOfEntries);
+```
+
+The method returns up to `numberOfEntries` raw managed-ledger entries starting from `startPosition`.
+It does not create, update, or consult a `ManagedCursor`.
+
+Before reading, the implementation normalizes the start position:
+
+* `PositionFactory.EARLIEST` starts from the first available entry.
+* `PositionFactory.LATEST` starts after the current last confirmed entry.
+* An existing entry position is used directly.
+* A missing position before the end of the readable range is moved to the next valid entry.
+* A position after the current last confirmed entry returns an empty list.
+
+The current ledger is read through the active write handle. Closed ledgers are read through the existing read-handle
+cache. The operation reads one ledger range at a time and continues until it reaches the requested count or the current
+readable end of the managed ledger.
+
+# Detailed Design
+
+## Public-facing Changes
+
+### Public API
+
+Add the following method in `ManagedLedger`:
+
+```java
+/**
+ * Read entries from the managed ledger starting from the provided position.
+ *
+ * <p>The start position is inclusive when it points to an existing entry. If it points to a non-existing entry,
+ * the read starts from the next valid entry in ledger order. {@link PositionFactory#EARLIEST} starts from the first
+ * available entry, while {@link PositionFactory#LATEST} starts from the position after the current last confirmed
+ * entry. This method does not wait for future writes and will complete with fewer entries, or an empty list, when
+ * there are not enough currently readable entries.
+ *
+ * <p>The returned entries are raw ledger entries and are not filtered by any cursor acknowledgement state. Callers
+ * are responsible for releasing returned entries.
+ */
+CompletableFuture<List<Entry>> asyncReadEntries(Position startPosition, int numberOfEntries);
Review Comment:
Could we also define the maxSize semantics here, especially how it interacts
with numberOfEntries?
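
To make the question concrete, here is a minimal sketch of one possible interaction between the two limits. It is an assumption for discussion, not something the PIP specifies: `maxSizeBytes`, `BatchLimitSketch`, and `limit` are hypothetical names, and entries are modelled as plain `byte[]` payloads rather than Pulsar `Entry` objects. In this reading the batch stops at whichever limit is hit first, and the first readable entry is always returned so that a single oversized entry cannot stall the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the PIP or of ManagedLedger.
final class BatchLimitSketch {

    /**
     * Returns the longest prefix of {@code readable} that respects both limits.
     * The first entry is admitted even if it alone exceeds {@code maxSizeBytes}.
     */
    static List<byte[]> limit(List<byte[]> readable, int numberOfEntries, long maxSizeBytes) {
        List<byte[]> result = new ArrayList<>();
        long totalBytes = 0;
        for (byte[] entry : readable) {
            // Count limit: never return more than numberOfEntries.
            if (result.size() >= numberOfEntries) {
                break;
            }
            // Size limit: applies only once at least one entry has been admitted.
            if (!result.isEmpty() && totalBytes + entry.length > maxSizeBytes) {
                break;
            }
            result.add(entry);
            totalBytes += entry.length;
        }
        return result;
    }
}
```

Under this reading, three 400-byte entries with `numberOfEntries = 10` and `maxSizeBytes = 1000` yield two entries. Other readings are possible (for example, treating the size limit as strict and returning an empty list), which is why it would help to spell the chosen one out in the Javadoc.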