poorbarcode opened a new pull request, #24945:
URL: https://github.com/apache/pulsar/pull/24945

   ### Motivation
   
   **Issue 1**: concurrent initialisation of the transaction buffer snapshot
   Before https://github.com/apache/pulsar/pull/21406, the snapshot was taken 
while the persistent topic was initialising, so there was no concurrency. After 
#21406, the first transaction buffer snapshot is triggered by publishing 
messages, so several publishes can trigger it concurrently. #21406 did not 
handle this case, which causes the following errors:
   
   ```
   2025-11-04T22:44:14,413 - WARN  - [pulsar-io-28-3:PersistentTopic] - 
[persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973] Failed to 
persist msg in store: 
org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException:
 Transaction Buffer take first snapshot failed, the current state is: Ready
   2025-11-04T22:44:14,413 - INFO  - [pulsar-io-28-3:PersistentTopic] - 
[persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973] Un-fencing 
topic...
   2025-11-04T22:44:14,414 - INFO  - [pulsar-client-io-96-3:ClientCnx] - 
[localhost/127.0.0.1:57291] Broker notification of closed producer: 0, 
assignedBrokerUrl: null, assignedBrokerUrlTls: null
   2025-11-04T22:44:14,412 - WARN  - [pulsar-client-io-262-3:ClientCnx] - [id: 
0xe9ef6b71, L:/127.0.0.1:57301 - R:localhost/127.0.0.1:57291] Received send 
error from server: PersistenceError : 
org.apache.bookkeeper.mledger.ManagedLedgerException: 
org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException:
 Transaction Buffer take first snapshot failed, the current state is: Ready
   2025-11-04T22:44:14,412 - WARN  - [pulsar-client-io-262-3:ClientCnx] - [id: 
0xe9ef6b71, L:/127.0.0.1:57301 - R:localhost/127.0.0.1:57291] Producer with id 
0 not found while handling send error
   2025-11-04T22:44:14,413 - INFO  - [pulsar-client-io-96-3:ProducerImpl] - 
[persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973] [test-0-1] 
Created producer on cnx [id: 0x2f0343b6, L:/127.0.0.1:57296 - 
R:localhost/127.0.0.1:57291]
   2025-11-04T22:44:14,413 - INFO  - [pulsar-client-io-96-3:ProducerImpl] - 
[persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973] [test-0-1] 
Re-Sending 1 messages to server
   2025-11-04T22:44:14,413 - INFO  - 
[broker-topic-workers-OrderedExecutor-8-0:ServerCnx] - [/127.0.0.1:57298] 
Created new producer: 
Producer{topic=PersistentTopic{topic=persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973},
 client=[id: 0x960ce44f, L:/127.0.0.1:57291 - R:/127.0.0.1:57298] 
[SR:127.0.0.1, state:Connected], producerName=test-0-3, producerId=0}, role: 
null
   2025-11-04T22:44:14,413 - INFO  - [pulsar-io-28-3:Producer] - Disconnecting 
producer: 
Producer{topic=PersistentTopic{topic=persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973},
 client=[id: 0x960ce44f, L:/127.0.0.1:57291 - R:/127.0.0.1:57298] 
[SR:127.0.0.1, state:Connected], producerName=test-0-3, producerId=0}, 
assignedBrokerLookupData: Optional.empty
   2025-11-04T22:44:14,413 - INFO  - [pulsar-io-28-3:Producer] - Disconnecting 
producer: 
Producer{topic=PersistentTopic{topic=persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973},
 client=[id: 0x039d7a90, L:/127.0.0.1:57291 - R:/127.0.0.1:57296] 
[SR:127.0.0.1, state:Connected], producerName=test-0-1, producerId=0}, 
assignedBrokerLookupData: Optional.empty
   2025-11-04T22:44:14,413 - WARN  - [pulsar-io-28-3:PersistentTopic] - 
[persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973] Failed to 
persist msg in store: 
org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException:
 Transaction Buffer take first snapshot failed, the current state is: Ready
   2025-11-04T22:44:14,413 - INFO  - [pulsar-io-28-3:PersistentTopic] - 
[persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973] Un-fencing 
topic...
   2025-11-04T22:44:14,414 - INFO  - [pulsar-client-io-96-3:ClientCnx] - 
[localhost/127.0.0.1:57291] Broker notification of closed producer: 0, 
assignedBrokerUrl: null, assignedBrokerUrlTls: null
   2025-11-04T22:44:14,414 - INFO  - [pulsar-client-io-163-3:ProducerImpl] - 
[persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973] [test-0-3] 
Created producer on cnx [id: 0xfbd3c65b, L:/127.0.0.1:57298 - 
R:localhost/127.0.0.1:57291]
   2025-11-04T22:44:14,414 - INFO  - [pulsar-client-io-96-3:ConnectionHandler] 
- [persistent://public/txn/tp-8064cb9f-1f8f-44f1-8bf2-872cc3870973] [test-0-1] 
Closed connection [id: 0x2f0343b6, L:/127.0.0.1:57296 - 
R:localhost/127.0.0.1:57291] -- Will try again in 0.1 s, hostUrl: null
   2025-11-04T22:44:14,414 - WARN  - [pulsar-client-io-96-3:ClientCnx] - [id: 
0x2f0343b6, L:/127.0.0.1:57296 - R:localhost/127.0.0.1:57291] Received send 
error from server: PersistenceError : 
org.apache.bookkeeper.mledger.ManagedLedgerException: 
org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException:
 Transaction Buffer take first snapshot failed, the current state is: Ready
   2025-11-04T22:44:14,414 - WARN  - [pulsar-client-io-96-3:ClientCnx] - [id: 
0x2f0343b6, L:/127.0.0.1:57296 - R:localhost/127.0.0.1:57291] Producer with id 
0 not found while handling send error
   2025-11-04T22:44:14,414 - INFO  - [pulsar-client-io-163-3:ClientCnx] - 
[localhost/127.0.0.1:57291] Broker notification of closed producer: 0, 
assignedBrokerUrl: null, assignedBrokerUrlTls: null
   ```
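
   The race in Issue 1 can be illustrated with a minimal sketch. It is not the 
actual `TopicTransactionBuffer` code: the class `FirstSnapshotGuard`, its state 
names, and `takeSnapshot()` are hypothetical stand-ins. The idea shown is that 
only the publisher that wins a compare-and-set takes the first snapshot, while 
concurrent publishers wait on the same future instead of failing because the 
state has already moved on.

   ```java
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.atomic.AtomicInteger;
   import java.util.concurrent.atomic.AtomicReference;

   // Hypothetical sketch, not the actual TopicTransactionBuffer implementation.
   class FirstSnapshotGuard {
       enum State { NO_SNAPSHOT, FIRST_SNAPSHOTTING, READY }

       private final AtomicReference<State> state =
               new AtomicReference<>(State.NO_SNAPSHOT);
       // Shared by every publisher that arrives before the first snapshot completes.
       private final CompletableFuture<Void> firstSnapshot = new CompletableFuture<>();
       private final AtomicInteger snapshotsTaken = new AtomicInteger();

       // Called on the publish path. Only the caller that wins the CAS takes the
       // snapshot; the others get the same future instead of an error.
       CompletableFuture<Void> ensureFirstSnapshot() {
           if (state.compareAndSet(State.NO_SNAPSHOT, State.FIRST_SNAPSHOTTING)) {
               takeSnapshot().whenComplete((ignore, error) -> {
                   if (error != null) {
                       // No retry in this sketch; the failure is propagated as-is.
                       firstSnapshot.completeExceptionally(error);
                   } else {
                       state.set(State.READY);
                       firstSnapshot.complete(null);
                   }
               });
           }
           return firstSnapshot;
       }

       // Stand-in for the real snapshot write (assumption: it is asynchronous).
       private CompletableFuture<Void> takeSnapshot() {
           snapshotsTaken.incrementAndGet();
           return CompletableFuture.completedFuture(null);
       }

       int snapshotsTaken() { return snapshotsTaken.get(); }

       State state() { return state.get(); }
   }
   ```

   With this shape, a second publish that observes a state other than 
`NO_SNAPSHOT` simply chains on the shared future.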
   
   **Issue 2: publishing messages before the transaction buffer is recovered.**
   Before https://github.com/apache/pulsar/pull/21406: a wrong variable was 
used when reconstructing the class; the correct variable is 
`snapshotAbortedTxnProcessor`, but `publishFuture` was used. See the following: 
   
   - 
https://github.com/apache/pulsar/pull/21406/files#diff-ecd728301a585f256e8a649b5e65b28c166194477355b3a1eefc198d014c25d3L221
   - 
https://github.com/apache/pulsar/pull/21406/files#diff-ecd728301a585f256e8a649b5e65b28c166194477355b3a1eefc198d014c25d3R255
   
   As a result, transaction buffer recovery and taking a transaction buffer 
snapshot can execute concurrently.
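
   The ordering that Issue 2 violates can be sketched as follows. This is an 
illustration under assumptions, not the code changed by this PR: 
`RecoveryGatedBuffer`, `recovered`, and `completeRecovery()` are hypothetical 
names. The sketch chains every publish on a recovery future, so nothing on the 
publish path (including a first snapshot) can run before recovery has finished.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.concurrent.CompletableFuture;

   // Hypothetical sketch: publishes are gated on a recovery future.
   class RecoveryGatedBuffer {
       // Completed once recovery has finished (assumed name).
       private final CompletableFuture<Void> recovered = new CompletableFuture<>();
       private final List<String> events = new ArrayList<>();

       // The publish work is deferred until recovery completes.
       CompletableFuture<Void> publish(String message) {
           return recovered.thenRun(() -> record("publish:" + message));
       }

       // Marks recovery as finished and releases any pending publishes.
       void completeRecovery() {
           record("recovered");
           recovered.complete(null);
       }

       private void record(String event) {
           synchronized (events) {
               events.add(event);
           }
       }

       // Returns a copy of the recorded events in the order they happened.
       List<String> events() {
           synchronized (events) {
               return new ArrayList<>(events);
           }
       }
   }
   ```

   A publish issued before `completeRecovery()` returns a future that is still 
pending, and its side effects are recorded only after the recovery event.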
   
   ### Modifications
   
   Fix the two issues described above.
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update 
later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   
   ### Matching PR in forked repository
   
   PR in forked repository: x


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
