YinY1 commented on issue #25145:
URL: https://github.com/apache/pulsar/issues/25145#issuecomment-3755986922
> [@YinY1](https://github.com/YinY1) I created https://github.com/lhotari/pulsar-playground/blob/master/src/main/java/com/github/lhotari/pulsar/playground/TestScenarioIssue25145.java based on PulsarBatchAckPseudoDemo. I wasn't able to reproduce the problem after fixing the race condition.

Thank you so much for providing the fixed code, and I apologize for not addressing the issue in my code sooner.

I used the code you provided with a few simple modifications so that I could run it in a loop, and I changed the number of messages produced in each iteration. The main changes are as follows:

```diff
+    static boolean inconsistentFound = false;
     // ...
         // Find and report any message that wasn't received by BOTH subscriptions.
         for (MessageId sentId : sentMessageIds) {
             Set<String> receivedBy = receiptTracker.getOrDefault(sentId, Collections.emptySet());
             if (receivedBy.size() < 2) {
+                inconsistentFound = true;
                 if (!receivedBy.contains("sub-1")) {
                     System.err.printf("[%s] not received from [sub-1]!%n", sentId);
                 }
                 if (!receivedBy.contains("sub-2")) {
                     System.err.printf("[%s] not received from [sub-2]!%n", sentId);
                 }
             }
         }

         System.out.println("Shutting down...");
         client.close();
         System.out.println("Done.");
+
+        if (inconsistentFound) {
+            System.err.println("Inconsistency detected!");
+            System.exit(1);
+        }
     }
```

Additionally, I used a script to execute the program in a loop:

```bash
iteration=0
while true; do
  iteration=$((iteration + 1))
  echo "====== Iteration ${iteration} ======"
  echo "Start time: $(date)"
  rm -rf /workspace/output
  mkdir -p /workspace/output
  current_time=$(date +%Y%m%d-%H%M%S)
  java -jar /workspace/target/test-scenario-issue-25145-1.0-SNAPSHOT.jar 2>&1 | tee /workspace/log/output_${current_time}.log
  rc=${PIPESTATUS[0]}
  if [ "${rc}" -eq 0 ]; then
    echo "====== Iteration ${iteration} completed successfully ======"
    echo "Sleeping 10s before next iteration..."
    sleep 10
    continue
  else
    echo "====== Iteration ${iteration} failed with exit code ${rc} ======"
    echo "Inconsistency detected. Total iterations run: ${iteration}"
    echo "Log saved to: /workspace/log/output_${current_time}.log"
    exit ${rc}
  fi
done
```

After running for a while, the issue reappeared. Here is the output log:

```
====== Iteration 7 ======
Start time: Thu Jan 15 04:05:10 PM UTC 2026
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
Starting producer task...
Starting consumer task for subscription [sub-2]...
Starting consumer task for subscription [sub-1]...
Finished producer task.
Finished consumer task for subscription [sub-1].
Finished consumer task for subscription [sub-2].
--- Total Acked Messages per Subscription ---
Subscription [sub-1]: 100000 acks
Subscription [sub-2]: 100000 acks
Shutting down...
Done.
====== Iteration 7 completed successfully ======
Sleeping 10s before next iteration...
====== Iteration 8 ======
Start time: Thu Jan 15 04:05:51 PM UTC 2026
SLF4J(W): No SLF4J providers were found.
SLF4J(W): Defaulting to no-operation (NOP) logger implementation
SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
Starting producer task...
Starting consumer task for subscription [sub-1]...
Starting consumer task for subscription [sub-2]...
Finished producer task.
Finished consumer task for subscription [sub-2].
Finished consumer task for subscription [sub-1].
--- Total Acked Messages per Subscription ---
Subscription [sub-1]: 99999 acks
Subscription [sub-2]: 100000 acks
[5426:163:7:582] not received from [sub-1]!
Shutting down...
Done.
Inconsistency detected!
====== Iteration 8 failed with exit code 1 ======
Inconsistency detected. Total iterations run: 8
```

Unfortunately, I didn't enable the debug log for AckGroupingTracker, so I can't provide the tracker's state at the time of the failure.
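
Because the loop keeps one log file per iteration under `/workspace/log`, the missing message IDs can be collected from those files after the fact. The following is only a minimal sketch, assuming the `[...] not received from [sub-N]!` line format shown in the output log above; the function name `missing_ids` is hypothetical and is not part of the test program.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "<message id> <subscription>" for every
# "not received" line in the given log files (or stdin if none are given).
missing_ids() {
  # Matches lines such as: [5426:163:7:582] not received from [sub-1]!
  sed -n 's/^\[\([0-9:]*\)] not received from \[\([^]]*\)]!.*$/\1 \2/p' "$@"
}

# Example usage, with the log path taken from the loop script:
#   missing_ids /workspace/log/output_*.log | sort | uniq -c
```

Piping the result through `sort | uniq -c` would show whether the same subscription or the same batch index is affected across failed iterations, which may help narrow down the cause.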
I've tested against several broker versions (multiple replicas, not standalone): 2.9.0, 4.0.8, and 4.1.1, and the bug still occurred. As I mentioned above, I suspect this bug is caused by `batchIndexAcknowledgement` or `PersistentAcknowledgmentsGroupingTracker`.

Additionally, I ran this test demo and broker 4.0.8 on k8s.

Thanks again for your patience!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
