RockteMQ-AI commented on issue #10494:
URL: https://github.com/apache/rocketmq/issues/10494#issuecomment-4709868251

   **Issue Evaluation**
   
   Category: `type/bug` | Status: **Confirmed**
   
   Verified against the current codebase. The race condition is real:
   
   1. `setUp()` (line 106) waits only for the **slave-side** `HAClient` to 
enter `TRANSFER` state.
   2. `DefaultHAConnection.slaveAckOffset` (line 58) and 
`AutoSwitchHAConnection.slaveAckOffset` (line 84) are both initialized to `-1`.
   3. The master-side `slaveAckOffset` is only updated when the master receives 
the slave's first offset report (`DefaultHAConnection` line 230 / 
`AutoSwitchHAConnection` line 341).
   4. `testSemiSyncReplica()` immediately starts semi-sync writes 
(`totalReplicas=2, inSyncReplicas=2`), which require the slave to have acked up 
to the current offset.
   
   On slower CI machines, the first `asyncPutMessage` can race the initial 
slave ack report and return `FLUSH_SLAVE_TIMEOUT` instead of `PUT_OK`.
   
   **Root Cause:** Missing wait condition. Before sending messages, the test 
should also wait until the master-side `slaveAckOffset` is `>= 0` (or has 
reached the slave's current max physical offset).
   **Impact:** Flaky CI failures in `HATest.testSemiSyncReplica`.
   **Severity:** Low — test-only issue, no production impact.
   
   The proposed fix is correct. A PR would be welcome.
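
   A minimal sketch of such a wait, assuming the test can read the master-side 
`slaveAckOffset` through some accessor (none is shown in this comment, so the 
`LongSupplier` below is a stand-in). `SlaveAckWait`, `awaitSlaveAck`, and the 
simulated `AtomicLong` are hypothetical names for illustration, not RocketMQ 
APIs:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper: not part of RocketMQ, shown only to illustrate the wait.
public class SlaveAckWait {

    /**
     * Polls ackOffset until it reports a value >= 0 or the timeout elapses.
     * Returns true if the ack was observed, false on timeout or interruption.
     */
    public static boolean awaitSlaveAck(LongSupplier ackOffset, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            // -1 is the initial value, meaning no offset report has arrived yet.
            if (ackOffset.getAsLong() >= 0) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the master-side slaveAckOffset (assumption: starts at -1).
        AtomicLong simulatedAckOffset = new AtomicLong(-1L);

        // Simulate the slave's first offset report arriving a little later.
        Thread slaveReport = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            simulatedAckOffset.set(0L);
        });
        slaveReport.start();

        boolean ready = awaitSlaveAck(simulatedAckOffset::get, 5_000, 10);
        System.out.println("slave acked: " + ready);
    }
}
```

   In `setUp()`, a call like this could follow the existing `TRANSFER`-state 
wait, with the test failing fast if it returns `false`.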
   
   ---
   *Automated evaluation by RockteMQ-AI*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
