thetumbled opened a new issue, #20150: URL: https://github.com/apache/pulsar/issues/20150
### Search before asking - [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar. ### Version master. ### Minimal reproduce step just restart brokers. ### What did you expect to see? `Committed` transactions can not be rollback to previous status. ### What did you see instead? Some transactions are `committed` before broker's restart, but become `open` status after the broker is restarted. I add some logs in broker as follows: - log found from `broker.out.1`, which is the log logged before restarting broker. ``` 2023-04-19T12:18:51,396+0800 [pulsar-io-36-8] DEBUG org.apache.pulsar.broker.service.ServerCnx - Receive add published partition to txn request 2376042138963580900 from /164.90.89.87:33256 with txnId (11,155293), topic: [persistent://test/tb1/testTxn10-partition-7] 2023-04-19T12:18:51,400+0800 [pulsar-io-36-8] DEBUG org.apache.pulsar.broker.service.ServerCnx - Receive add published partition to txn request 2376042138963580901 from /164.90.89.87:33256 with txnId (11,155293), topic: [persistent://test/tb1/testTxn10-partition-8] 2023-04-19T12:18:51,408+0800 [pulsar-io-36-1] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Append buffer to txn (11,155293) 2023-04-19T12:18:51,408+0800 [pulsar-io-36-37] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Append buffer to txn (11,155293) 2023-04-19T12:18:51,415+0800 [pulsar-io-36-8] DEBUG org.apache.pulsar.broker.service.ServerCnx - Receive end txn request 2376042138963580907 to transaction meta store TransactionCoordinatorID(id=11) for txnId:(11,155293). 2023-04-19T12:18:51,449+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TransactionBufferHandlerImpl - [persistent://test/tb1/testTxn10-partition-7] endTxnOnTopic txnId: [(11,155293)], txnAction: [0] 2023-04-19T12:18:51,449+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TransactionBufferHandlerImpl - [persistent://test/tb1/testTxn10-partition-8] endTxnOnTopic txnId: [(11,155293)], txnAction: [0] 2023-04-19T12:18:51,449+0800 [pulsar-io-36-32] DEBUG org.apache.pulsar.broker.service.ServerCnx - [persistent://test/tb1/testTxn10-partition-7] handleEndTxnOnPartition txnId: [(11,155293)], txnAction: [0] 2023-04-19T12:18:51,449+0800 [pulsar-io-36-32] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Transaction (11,155293) commit on topic persistent://test/tb1/testTxn10-partition-7. 2023-04-19T12:18:51,449+0800 [pulsar-io-36-32] DEBUG org.apache.pulsar.broker.service.ServerCnx - [persistent://test/tb1/testTxn10-partition-8] handleEndTxnOnPartition txnId: [(11,155293)], txnAction: [0] 2023-04-19T12:18:51,449+0800 [pulsar-io-36-32] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Transaction (11,155293) commit on topic persistent://test/tb1/testTxn10-partition-8. 2023-04-19T12:18:51,508+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.transaction.coordinator.impl.MLTransactionMetadataStore - TxnID : (11,155293) update txn status to COMMITTED ``` - log found from `broker.out`, which is the log logged after restarting broker. ``` 2023-04-19T12:19:26,414+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TransactionBufferHandlerImpl - [persistent://test/tb1/testTxn10-partition-7] endTxnOnTopic txnId: [(11,155293)], txnAction: [1] 2023-04-19T12:19:26,415+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TransactionBufferHandlerImpl - [persistent://test/tb1/testTxn10-partition-8] endTxnOnTopic txnId: [(11,155293)], txnAction: [1] 2023-04-19T12:19:26,415+0800 [pulsar-io-36-54] DEBUG org.apache.pulsar.broker.service.ServerCnx - [persistent://test/tb1/testTxn10-partition-7] handleEndTxnOnPartition txnId: [(11,155293)], txnAction: [1] 2023-04-19T12:19:26,415+0800 [pulsar-io-36-54] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Transaction (11,155293) abort on topic persistent://test/tb1/testTxn10-partition-7. 2023-04-19T12:19:26,415+0800 [pulsar-io-36-54] DEBUG org.apache.pulsar.broker.service.ServerCnx - [persistent://test/tb1/testTxn10-partition-8] handleEndTxnOnPartition txnId: [(11,155293)], txnAction: [1] 2023-04-19T12:19:26,415+0800 [pulsar-io-36-54] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Transaction (11,155293) abort on topic persistent://test/tb1/testTxn10-partition-8. 2023-04-19T12:19:26,425+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.transaction.coordinator.impl.MLTransactionMetadataStore - TxnID : (11,155293) update txn status to ABORTED ``` and i found some logs in `TransactionRecoverTrackerImpl`. ``` 2023-04-19T12:19:24,745+0800 [transaction_coordinator_TransactionCoordinatorID(id=11)_thread_factory-89-1] DEBUG org.apache.pulsar.broker.transaction.recover.TransactionRecoverTrackerImpl - Append open transaction to timeout tracker, tcId: 11, openTransactions: {155294=1681877941414, 155293=1681877941393} ``` Based on logs above, we can see that the transaction `(11,155293)` is rollback to `open` status, and is aborted due to transaction timeout machanism. **Important clue: Such problem will only occurs when the TC log batch write feature is enabled.** ### Anything else? _No response_ ### Are you willing to submit a PR? - [ ] I'm willing to submit a PR! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
