merlimat opened a new pull request, #25959:
URL: https://github.com/apache/pulsar/pull/25959

   ### Motivation
   
   Scalable-topics (v5) transactions never timed out at the requested value: a 
1-minute timeout was applied as **~16.7 hours**, so the coordinator's timeout 
sweep effectively never aborted a dangling (never-committed) transaction, 
leaving its transaction buffer pinned indefinitely (which also blocks 
visibility of later writes on the same segment).
   
   **Root cause.** The `NEW_TXN` handler's v5 branch in `ServerCnx` multiplied 
the wire ttl by 1000. But the ttl field has always carried **milliseconds** — 
the client sends `unit.toMillis(...)`, and both the legacy coordinator 
(`TransactionMetadataStoreService.newTransaction`, parameter `timeoutInMills`) 
and the v5 coordinator (`TransactionCoordinatorV5.newTransaction`, 
`Duration.ofMillis(...)`) consume milliseconds. The field's legacy name, 
`txn_ttl_seconds`, was a misnomer that invited the bug.
   
   ### Modifications
   
   - `ServerCnx.handleNewTxn`: drop the `* 1000` on the v5 path; pass the 
millisecond value straight through, matching the legacy path.
   - Rename the wire field `CommandNewTxn.txn_ttl_seconds` → `txn_ttl_millis` 
so the unit confusion can't recur, and update `Commands.newTxn` accordingly.
   - Add `V5TransactionTimeoutTest`: a never-committed v5 transaction must be 
aborted by the broker timeout sweep, unpinning the buffer so a later 
non-transactional message becomes visible.
   
   ### Compatibility
   
   The binary protocol is **unchanged**: the field number (`2`) and type 
(`uint64`) are identical — only the field's name and the generated Java 
accessor names change (`getTxnTtlSeconds` → `getTxnTtlMillis`). Older and newer 
clients/brokers interoperate exactly as before.
   
   ### Verifying this change
   
   - New `V5TransactionTimeoutTest` reproduces the bug (failed before the fix, 
passes after).
   - No regression: `V5TransactionTest` 7/7, `V5TransactionScalableTest` 8/8, 
`V5TransactionRecoveryTest` 3/3, `TransactionCoordinatorV5Test` 22/22, 
`CommandsScalableTxnFlagTest` 6/6; checkstyle clean.
   
   ### Does this pull request potentially affect one of the following parts:
   
   - Dependencies: no
   - The public API: no
   - The schema: no
   - The default values of configurations: no
   - The threading model: no
   - The binary protocol: no (field number and type unchanged; only the field 
name)
   - The REST endpoints: no
   - The admin CLI options: no
   - Anything that affects deployment: no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to