merlimat opened a new pull request, #25975: URL: https://github.com/apache/pulsar/pull/25975
### Motivation

The metadata transaction buffer's durable aborted-transaction records (`/txn/segment-state/<segment>/aborted/<txnId>`) and its in-memory `abortedTxns` set grew for the segment's **entire lifetime**:

- They were never pruned when the segment's managed ledger **trimmed** the underlying data (retention / ack-based deletion within a live segment).
- They were never deleted when the whole **segment/topic was dropped** (`deleteSegmentWatermark` was dead code; `deleteScalableTopic` left the records orphaned).

The result is storage- and heap-growth proportional to the total number of aborts over the segment's life, not to the live data. This is the trim/drop GC the design always intended (each aborted record already stores its max position via `IDX_TXN_ABORTED_BY_POSITION`) but never wired up. It is a resource leak, not a data-correctness bug: trimmed positions are never dispatched, so their abort-filtering is no longer needed, and transaction ids never repeat.

### Modifications

- **`MetadataTransactionBuffer`: ML-trim GC.** Schedule a periodic task on the broker executor (cadence = `transactionCoordinatorScalableTopicsGcIntervalSeconds`; cancelled on close; null-guarded so mocked unit tests drive it directly). `pruneTrimmedAbortedTxns()` reads `ledger.getFirstPosition()` and range-deletes the durable aborted records (and evicts their `abortedTxns` entries) whose highest position in the segment is strictly below it (fully trimmed). Records for still-readable data are retained. New aborts land at the tail, so they are never pruned.
- **`TxnMetadataStore.deleteAllSegmentState(segment)`**: delete every aborted record plus the watermark for a segment (idempotent).
- **`ScalableTopicService.deleteScalableTopic`**: when transactions are enabled, clean up each segment's `/txn/segment-state` alongside deleting the segment topic.
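The prune rule described above can be illustrated with a small, self-contained sketch. This is not the PR's code: `AbortedTxnGcSketch`, `Pos`, `Index`, `recordAbort`, and `pruneBelow` are hypothetical names, and a `TreeMap` merely stands in for the durable index ordered by each aborted transaction's max position. It only shows the "strictly below the first readable position" cut-off and the matching eviction from the in-memory set.

```java
import java.util.HashSet;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

final class AbortedTxnGcSketch {

    /** Hypothetical (ledgerId, entryId) position; a stand-in, not Pulsar's Position type. */
    static final class Pos implements Comparable<Pos> {
        final long ledgerId;
        final long entryId;

        Pos(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        @Override
        public int compareTo(Pos o) {
            int c = Long.compare(ledgerId, o.ledgerId);
            return c != 0 ? c : Long.compare(entryId, o.entryId);
        }
    }

    /** Toy model of the bookkeeping: a sorted map plays the role of the durable by-position index. */
    static final class Index {
        // Assumption: aborted txn ids grouped by the highest position they wrote in the segment.
        private final TreeMap<Pos, Set<String>> byMaxPosition = new TreeMap<>();
        // Stand-in for the in-memory abortedTxns set used for dispatch filtering.
        private final Set<String> abortedTxns = new HashSet<>();

        void recordAbort(String txnId, Pos maxPosition) {
            byMaxPosition.computeIfAbsent(maxPosition, p -> new HashSet<>()).add(txnId);
            abortedTxns.add(txnId);
        }

        boolean isAborted(String txnId) {
            return abortedTxns.contains(txnId);
        }

        /** Removes every record whose max position is strictly below firstPosition; returns the count. */
        int pruneBelow(Pos firstPosition) {
            // Exclusive upper bound: a txn whose max position equals firstPosition is still readable.
            NavigableMap<Pos, Set<String>> trimmed = byMaxPosition.headMap(firstPosition, false);
            int pruned = 0;
            for (Set<String> ids : trimmed.values()) {
                abortedTxns.removeAll(ids);
                pruned += ids.size();
            }
            trimmed.clear(); // the head-map view writes through to byMaxPosition
            return pruned;
        }
    }

    public static void main(String[] args) {
        Index idx = new Index();
        idx.recordAbort("txn-1", new Pos(3, 10)); // fully below the trim point
        idx.recordAbort("txn-2", new Pos(5, 0));  // exactly at the first readable position
        idx.recordAbort("txn-3", new Pos(5, 7));  // at the tail
        int pruned = idx.pruneBelow(new Pos(5, 0));
        System.out.println("pruned=" + pruned
                + " txn-1=" + idx.isAborted("txn-1")
                + " txn-2=" + idx.isAborted("txn-2"));
    }
}
```

In this toy run only `txn-1` is dropped; `txn-2` and `txn-3` stay filtered because their data is still readable. Calling `pruneBelow` again with the same position removes nothing, which is the property that makes a periodic task safe to repeat.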
### Verifying this change

Covered by unit tests:

- `MetadataTransactionBufferTest.pruneTrimmedAborted_dropsBelowFirstValid_retainsAbove`: a fully-trimmed aborted txn is dropped from both the in-memory set and the durable store (verified via a fresh TB's recovery); a still-readable aborted txn stays filtered.
- `TxnMetadataStoreTest.deleteAllSegmentState_removesAbortedRecordsAndWatermark`.
- `ScalableTopicServiceTest` (delete wiring) and `V5TransactionScalableTest` (e2e abort + topic cleanup) pass.

### Does this pull request potentially affect one of the following parts:

- Dependencies: no
- The public API: no
- The schema: no
- The default values of configurations: no
- The threading model: a per-segment-buffer periodic GC task is scheduled on the existing broker executor
- The binary protocol: no
- The REST endpoints: no
- The admin CLI options: no
- Anything that affects deployment: no
