merlimat opened a new pull request, #25884: URL: https://github.com/apache/pulsar/pull/25884
## Summary

Adds the periodic background sweeps for the scalable-topics transaction coordinator landed in [#25863](https://github.com/apache/pulsar/pull/25863) (P5.1). Both run on a dedicated single-thread scheduler started by `PulsarService` and gated on assign-partition-0 ownership: only the elected broker sweeps each cycle. Concurrent sweeps from a stale owner remain safe because every state transition is a header CAS; the election is purely an efficiency measure (per-partition scoping comes with the partitioned TC in a later phase).

### Timeout sweep

Default cadence **60s**. Scans the by-deadline index up to `now` and drives each expired open txn through `endTransaction(ABORT)`, which re-reads and CAS-guards the header, so a txn the client commits in the same window is left alone (the resulting `InvalidTxnStatusException` / `BadVersionException` is treated as a benign race and logged at debug).

### GC sweep

Default cadence **300s**, retention **900s**. For each terminal state, scans the by-final-state index up to `now - retention`. For each candidate:

- If leftover `/txn/op` records remain (some participant hasn't applied the outcome yet, or never received the event, e.g. because the TC crashed between the header CAS and the fan-out), re-drive `fanOutEvents` and **leave the header in place** so the participant can re-read the true outcome. The participant removes its op records once it applies them, and a later GC pass, seeing no op records, deletes the header.
- If no op records remain, delete the header.

This ordering closes the fan-out-durability gap [lhotari raised on #25863](https://github.com/apache/pulsar/pull/25863#discussion_r3298435980) without ever stranding a committed txn's data: we never delete a header while a participant might still re-read it (which would default the outcome to ABORTED).
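The GC decision described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `GcSweepSketch`, its fields, and `participantApplied` are hypothetical names, plain maps stand in for the metadata store and its by-final-state index, and the `redriven` list stands in for the call to `fanOutEvents`. The actual change is asynchronous and CAS-guarded, which this sketch does not attempt to model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of one GC pass; not the PR's implementation. */
class GcSweepSketch {
    /** Terminal txn id -> time (ms) at which its header reached a final state. */
    final Map<String, Long> finalizedAt = new HashMap<>();
    /** Txn id -> participants that still hold an unapplied op record. */
    final Map<String, Set<String>> pendingOps = new HashMap<>();
    /** Txn ids whose outcome was re-published (stand-in for fanOutEvents). */
    final List<String> redriven = new ArrayList<>();

    /** Visits every header finalized at or before (now - retention). */
    void sweepGc(long nowMs, long retentionMs) {
        long cutoff = nowMs - retentionMs;
        // Iterate over a copy so headers can be removed during the pass.
        for (String txnId : new ArrayList<>(finalizedAt.keySet())) {
            if (finalizedAt.get(txnId) > cutoff) {
                continue; // still inside the retention window
            }
            Set<String> ops = pendingOps.get(txnId);
            if (ops != null && !ops.isEmpty()) {
                // Op records remain: re-drive the fan-out and KEEP the header
                // so the participant can still re-read the true outcome.
                redriven.add(txnId);
            } else {
                // No op records remain: the header is safe to delete.
                finalizedAt.remove(txnId);
                pendingOps.remove(txnId);
            }
        }
    }

    /** A participant applied the outcome and removed its own op record. */
    void participantApplied(String txnId, String participant) {
        Set<String> ops = pendingOps.get(txnId);
        if (ops != null) {
            ops.remove(participant);
        }
    }
}
```

In this model a header with a leftover op record survives any number of passes and is only removed by the first pass that runs after `participantApplied` has cleared the record, which mirrors the two-pass behaviour the bullets describe.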
### Config

| Key | Default |
|---|---|
| `transactionCoordinatorScalableTopicsTimeoutSweepIntervalSeconds` | 60 |
| `transactionCoordinatorScalableTopicsGcIntervalSeconds` | 300 |
| `transactionCoordinatorScalableTopicsGcRetentionSeconds` | 900 |

All only meaningful when `transactionCoordinatorScalableTopicsEnabled = true` (still off by default).

### Drive-by

Refactored `fanOutEvents` to use `FutureUtil.waitForAll(List<CompletableFuture<Void>>)`, which matches the new sweep methods and addresses the same comment lhotari left on P5.1.

## Test plan

- [x] `pulsar-broker:test --tests TransactionCoordinatorV5Test` (5 new sweep cases plus all P5.1 cases):
  - `sweepTimeouts_abortsExpiredOpenTxnAndFansOut`
  - `sweepTimeouts_leavesUnexpiredOpenTxnAlone`
  - `sweepGc_deletesHeaderWhenNoOpsRemain`
  - `sweepGc_repairsAndRetainsHeaderWhenOpsRemain` (the fan-out-durability scenario)
  - `sweeps_skipWhenNotElected`
- [x] `pulsar-broker:test --tests TxnMetadataStoreTest` / `MetadataTransactionBufferTest` / `MetadataPendingAckStoreTest`: green.
- [x] Checkstyle clean (main + test).

## Deferred / follow-ups

- **Per-partition sweep scoping** lands with the partitioned TC (P5.3), replacing the single-elected-sweeper interim.
- **Pure metadata-store leader election** also belongs to P5.3.
- A leftover op record from a permanently-gone participant (segment deleted) currently keeps its header alive forever; the GC sweep keeps re-publishing harmlessly. A future phase can add a liveness check to force-cleanup, but doing so safely needs the participant-liveness signal that doesn't exist yet.
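For readers who want a concrete picture of the election-gated scheduling from the summary, here is a minimal sketch. All names in it (`SweepSchedulerSketch`, `runIfElected`, the `isElected` supplier) are hypothetical, and the choice of fixed-delay scheduling is an assumption; the PR text only states that both sweeps share one dedicated single-thread scheduler and are skipped on brokers that are not elected. The interval arguments correspond to the two `...IntervalSeconds` settings (defaults 60 and 300).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of election-gated periodic sweeps; not the PR's code. */
class SweepSchedulerSketch implements AutoCloseable {
    // One thread for both sweeps, so the two never overlap each other.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final BooleanSupplier isElected;

    SweepSchedulerSketch(BooleanSupplier isElected) {
        this.isElected = isElected;
    }

    /** Runs the sweep only when this broker is elected; returns whether it ran. */
    boolean runIfElected(Runnable sweep) {
        if (!isElected.getAsBoolean()) {
            return false; // not the owner this cycle: skip
        }
        try {
            sweep.run();
        } catch (RuntimeException e) {
            // An exception escaping a scheduled task would cancel all of its
            // future runs, so it is reported here and the schedule continues.
            System.err.println("sweep failed: " + e);
        }
        return true;
    }

    /** Schedules both sweeps; ownership is re-checked on every cycle. */
    void start(Runnable timeoutSweep, long timeoutIntervalSec,
               Runnable gcSweep, long gcIntervalSec) {
        scheduler.scheduleWithFixedDelay(() -> runIfElected(timeoutSweep),
                timeoutIntervalSec, timeoutIntervalSec, TimeUnit.SECONDS);
        scheduler.scheduleWithFixedDelay(() -> runIfElected(gcSweep),
                gcIntervalSec, gcIntervalSec, TimeUnit.SECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Checking ownership inside each cycle rather than once at startup means a broker that loses or gains ownership changes behaviour on its next tick without rescheduling. A stale owner that sweeps one extra cycle is tolerated because, as the summary notes, every state transition is a header CAS.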
