mucciolo opened a new pull request, #517:
URL: https://github.com/apache/pekko-persistence-jdbc/pull/517

   Fixes #516
   
   ### The bug
   
   `messagesWithBatch` backs `eventsByPersistenceId`, 
`currentEventsByPersistenceId` and actor recovery. Since the query windowing 
introduced in #180, a batch window `[from, from + batchSize]` containing fewer 
live messages than `batchSize` is treated as the end of the journal:
   
   * bounded streams complete early and silently drop every message beyond the 
gap;
   * live streams with a refresh interval poll the same empty window forever 
and never emit again.
   
   Gaps in sequence numbers are produced by `deleteMessages` (hard deletes 
below the retained marker row) and by cleanup of soft-deleted rows. #195 fixed 
this only for a gap at the head of the journal (the snapshot cleanup case). 
Upstream akka-persistence-jdbc does not have this bug. A full analysis and 
regression timeline is in #516.
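
The failure mode can be reproduced without a database. The sketch below is a simplified model, not the project's code: `messages` is a hypothetical stand-in for the journal query (ascending order, inclusive bounds, LIMIT), the journal is just a sorted vector of surviving sequence numbers, and `readWindowedOnly` is an assumed reduction of the pre-fix loop that treats any short window as the end of the journal. It assumes `batchSize >= 1`.

```scala
import scala.annotation.tailrec

object WindowedReadBugSketch {

  // Hypothetical stand-in for the journal query: ascending order, inclusive
  // bounds, LIMIT. `journal` holds the surviving sequence numbers, sorted.
  def messages(journal: Vector[Long], from: Long, to: Long, max: Int): Vector[Long] =
    journal.filter(seqNr => seqNr >= from && seqNr <= to).take(max)

  // Assumed simplification of the windowed-only loop: every query is limited
  // to the window [windowFrom, windowFrom + batchSize], and a short batch
  // ends the stream.
  def readWindowedOnly(journal: Vector[Long], from: Long, to: Long, batchSize: Int): Vector[Long] = {
    @tailrec
    def loop(windowFrom: Long, acc: Vector[Long]): Vector[Long] = {
      val windowEnd = math.min(windowFrom + batchSize, to)
      val batch = messages(journal, windowFrom, windowEnd, batchSize)
      if (batch.size < batchSize) acc ++ batch // short window taken as the end
      else loop(batch.last + 1, acc ++ batch)
    }
    loop(from, Vector.empty)
  }
}
```

With sequence numbers 1 to 10 and 50 to 52 surviving and a batch size of 5, the third window `[11, 16]` is empty, so the loop stops and never sees 50 to 52, even though a single unwindowed query from 11 would return them.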
   
   ### The fix
   
   The `unfoldAsync` state in `BaseJournalDaoWithReadMessages` becomes a small 
query-plan state machine:
   
   * `QueryRemaining(from)` queries the whole remaining range `[from, toSequenceNr]` 
with the batch-size LIMIT. It is planned first, and again whenever a short 
windowed batch leaves the remainder undetermined, because a gap cannot hide 
messages from an unwindowed query.
   * `QueryWindow(from, endInclusive)` is the dense fast path, planned only 
after a full batch has shown the journal to be dense. This keeps the 
performance safeguard of #180; it is not load-bearing for correctness because 
a short window falls back to `QueryRemaining`.
   * `PollRemaining(from, delay, scheduler)` is the live tail. It polls the 
full remaining range: a windowed poll can never reach messages appended beyond 
a trailing gap.
   * `Complete`.
   
   ```mermaid
   stateDiagram-v2
       [*] --> QueryRemaining
   
       QueryRemaining --> QueryWindow: full batch
       QueryRemaining --> PollRemaining: short batch, polling
       QueryRemaining --> Complete: reached the end, or short batch when not polling
   
       QueryWindow --> QueryWindow: full batch
       QueryWindow --> QueryRemaining: short batch before toSequenceNr (gap fallback)
       QueryWindow --> PollRemaining: short batch at toSequenceNr, polling
       QueryWindow --> Complete: reached the end, or short batch at toSequenceNr when not polling
   
       PollRemaining --> QueryWindow: full batch
       PollRemaining --> PollRemaining: short batch
       PollRemaining --> Complete: reached the end
   
       Complete --> [*]
   ```
   
   * **full batch**: the query returned `batchSize` messages; **short batch**: 
fewer.
   * **reached the end**: the last message is at or beyond `toSequenceNr` (or, 
on the first query, the requested range was already empty).
   * **polling**: a refresh interval is configured.
   
   The bug was the missing `QueryWindow` to `QueryRemaining` edge: a short 
windowed batch was treated as the end of the journal.
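
As a rough illustration of the transitions in the diagram, here is a pure sketch of the plan as an ADT plus a transition function. It is not the code in `BaseJournalDaoWithReadMessages`: the state names come from this description, but `next`, `readBounded`, the in-memory `messages` helper, and the choice to resume at `last + 1` are assumptions made for the example. `PollRemaining` omits its delay and scheduler, and the window bound follows the `[from, from + batchSize]` wording used in this description.

```scala
import scala.annotation.tailrec

object QueryPlanSketch {

  sealed trait Plan
  final case class QueryRemaining(from: Long) extends Plan
  final case class QueryWindow(from: Long, endInclusive: Long) extends Plan
  final case class PollRemaining(from: Long) extends Plan // delay and scheduler omitted
  case object Complete extends Plan

  // Hypothetical in-memory journal query: ascending, inclusive bounds, LIMIT.
  def messages(journal: Vector[Long], from: Long, to: Long, max: Int): Vector[Long] =
    journal.filter(seqNr => seqNr >= from && seqNr <= to).take(max)

  private def window(from: Long, to: Long, batchSize: Int): Plan =
    QueryWindow(from, math.min(from + batchSize, to))

  // Next plan after `plan` returned `batch` (assumes batchSize >= 1).
  def next(plan: Plan, batch: Vector[Long], to: Long, batchSize: Int, polling: Boolean): Plan = {
    val full = batch.size == batchSize
    val reachedEnd = batch.lastOption.exists(_ >= to)
    def resumeFrom(from: Long): Long = batch.lastOption.fold(from)(_ + 1)
    plan match {
      case Complete        => Complete
      case _ if reachedEnd => Complete
      case QueryRemaining(from) =>
        if (from > to) Complete // requested range was already empty
        else if (full) window(resumeFrom(from), to, batchSize)
        else if (polling) PollRemaining(resumeFrom(from))
        else Complete
      case QueryWindow(from, endInclusive) =>
        if (full) window(resumeFrom(from), to, batchSize)
        else if (endInclusive < to) QueryRemaining(resumeFrom(from)) // gap fallback
        else if (polling) PollRemaining(resumeFrom(from))
        else Complete
      case PollRemaining(from) =>
        if (full) window(resumeFrom(from), to, batchSize)
        else PollRemaining(resumeFrom(from))
    }
  }

  // Drives a bounded (non-polling) read; returns the messages and the query count.
  def readBounded(journal: Vector[Long], from: Long, to: Long, batchSize: Int): (Vector[Long], Int) = {
    @tailrec
    def loop(plan: Plan, acc: Vector[Long], queries: Int): (Vector[Long], Int) = {
      val range: Option[(Long, Long)] = plan match {
        case QueryRemaining(f)           => Some((f, to))
        case QueryWindow(f, end)         => Some((f, end))
        case PollRemaining(_) | Complete => None // a live stream would wait and poll here
      }
      range match {
        case None => (acc, queries)
        case Some((f, end)) =>
          val batch = messages(journal, f, end, batchSize)
          loop(next(plan, batch, to, batchSize, polling = false), acc ++ batch, queries + 1)
      }
    }
    loop(QueryRemaining(from), Vector.empty, 0)
  }
}
```

In this model, a journal holding 1 to 10 and 50 to 52 read with a batch size of 5 takes four queries: one full-range query, two windows, and one fallback to the full range after the empty window. A dense journal never takes the fallback edge.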
   
   Notes:
   
   * The first query spans the full remaining range, so a journal head purged 
up to a snapshot is crossed in one round trip, subsuming the #195 special case. 
The limited query costs the same as a windowed one when the journal is dense.
   * Cost of a gap on the dense path: one extra full-range limited query per 
short windowed batch. A dense journal issues the same queries as before, which 
an interaction test pins.
   * Public signatures are unchanged.
   
   ### Tests
   
   * `MessagesWithBatchTest` (core): the gap state machine specified against an 
in-memory journal stub honoring the `messages` contract (ascending order, 
inclusive bounds, LIMIT). 26 tests covering gaps narrower and wider than the 
batch size, leading and trailing gaps, live polling across gaps, bounded live 
completion, failure pass-through and the two interaction tests pinning the 
first-query full range and the dense windowing of #180.
   * `MessagesWithBatchDatabaseContractTest` (H2 in core; Postgres, MySQL, 
MariaDB, Oracle and SQL Server in integration-test): hard-deleted gaps, 
soft-deleted messages, a mixed hard/soft gap, a prefix purge through the 
journal's `delete`, and a corrupt-row serialization failure.
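
For readers unfamiliar with this style of test, the following is a minimal sketch of what a journal stub honoring the `messages` contract and supporting interaction assertions could look like. It is not the stub used in `MessagesWithBatchTest`: `JournalStub`, `Query`, and the use of bare sequence numbers instead of persistent representations are all hypothetical simplifications.

```scala
import scala.collection.mutable.ArrayBuffer

object RecordingJournalStubSketch {

  // One recorded call to the stubbed query.
  final case class Query(from: Long, to: Long, max: Int)

  // Hypothetical stub: stores surviving sequence numbers and records every
  // query so a test can assert on the exact ranges that were requested.
  final class JournalStub(seqNrs: Seq[Long]) {
    private val sorted: Vector[Long] = seqNrs.sorted.toVector
    private val recorded = ArrayBuffer.empty[Query]

    // Contract: ascending order, inclusive bounds, at most `max` results.
    def messages(from: Long, to: Long, max: Int): Vector[Long] = {
      recorded += Query(from, to, max)
      sorted.filter(seqNr => seqNr >= from && seqNr <= to).take(max)
    }

    def queries: Vector[Query] = recorded.toVector
  }
}
```

An interaction test can then compare `queries` against the expected plan, for example checking that the first recorded query spans the full remaining range.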
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

