YinY1 commented on issue #25145:
URL: https://github.com/apache/pulsar/issues/25145#issuecomment-3755986922

   > [@YinY1](https://github.com/YinY1) I created 
https://github.com/lhotari/pulsar-playground/blob/master/src/main/java/com/github/lhotari/pulsar/playground/TestScenarioIssue25145.java
 based on PulsarBatchAckPseudoDemo. I wasn't able to reproduce the problem 
after fixing the race condition.
   
   Thank you so much for providing the fixed code! I apologize for not 
addressing the issue in my code sooner.
   
   I took the code you provided and made a few small modifications so that I 
could run it in a loop, and I changed the number of messages produced in each 
iteration. The main changes are as follows:
   
   ```diff
   +     static boolean inconsistentFound = false;
   
       // ...
   
                   // Find and report any message that wasn't received by BOTH subscriptions.
                   for (MessageId sentId : sentMessageIds) {
                       Set<String> receivedBy = receiptTracker.getOrDefault(sentId, Collections.emptySet());
                       if (receivedBy.size() < 2) {
   +                       inconsistentFound = true;
                           if (!receivedBy.contains("sub-1")) {
                               System.err.printf("[%s] not received from [sub-1]!%n", sentId);
                           }
                           if (!receivedBy.contains("sub-2")) {
                               System.err.printf("[%s] not received from [sub-2]!%n", sentId);
                           }
                       }
                   }
   
                   System.out.println("Shutting down...");
                   client.close();
                   System.out.println("Done.");
   +
   +               if (inconsistentFound) {
   +                   System.err.println("Inconsistency detected!");
   +                   System.exit(1);
   +               }
               }
   ```
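   
   For readers without the full demo at hand, the check above can be 
approximated by the following self-contained sketch. It is only an 
illustration: the class name `ReceiptCheckSketch`, the helper methods, and the 
use of plain strings instead of `MessageId` are assumptions made for this 
example and are not taken from the actual test code.
   
   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.Collections;
   import java.util.List;
   import java.util.Map;
   import java.util.Set;
   import java.util.concurrent.ConcurrentHashMap;

   // Hypothetical stand-in for the demo's receipt tracking; no Pulsar dependency.
   public class ReceiptCheckSketch {

       // Records that `subscription` received `messageId`. Safe to call from
       // several consumer threads when `tracker` is a ConcurrentHashMap.
       public static void recordReceipt(Map<String, Set<String>> tracker,
                                        String messageId, String subscription) {
           tracker.computeIfAbsent(messageId, id -> ConcurrentHashMap.newKeySet())
                  .add(subscription);
       }

       // Returns one report line per (message, subscription) pair that is missing.
       public static List<String> findMissing(Map<String, Set<String>> tracker,
                                              List<String> sentIds,
                                              List<String> subscriptions) {
           List<String> missing = new ArrayList<>();
           for (String sentId : sentIds) {
               Set<String> receivedBy =
                       tracker.getOrDefault(sentId, Collections.emptySet());
               for (String sub : subscriptions) {
                   if (!receivedBy.contains(sub)) {
                       missing.add(String.format(
                               "[%s] not received from [%s]!", sentId, sub));
                   }
               }
           }
           return missing;
       }

       public static void main(String[] args) {
           Map<String, Set<String>> tracker = new ConcurrentHashMap<>();
           recordReceipt(tracker, "1:0:0:0", "sub-1");
           recordReceipt(tracker, "1:0:0:0", "sub-2");
           recordReceipt(tracker, "1:0:0:1", "sub-2"); // sub-1 never sees this one

           List<String> missing = findMissing(tracker,
                   Arrays.asList("1:0:0:0", "1:0:0:1"),
                   Arrays.asList("sub-1", "sub-2"));
           missing.forEach(System.out::println);
       }
   }
   ```
   
   A non-empty result from `findMissing` corresponds to the case where the 
modified demo sets `inconsistentFound` and exits with status 1.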
   
   Additionally, I used a script to execute the program in a loop:
   ```bash
   #!/usr/bin/env bash
   # PIPESTATUS below is a bash feature, so this must run under bash.
   mkdir -p /workspace/log   # make sure the log directory exists for tee
   iteration=0
   while true; do
           iteration=$((iteration + 1))
           echo "====== Iteration ${iteration} ======"
           echo "Start time: $(date)"
           rm -rf /workspace/output
           mkdir -p /workspace/output
           current_time=$(date +%Y%m%d-%H%M%S)
   
           java -jar /workspace/target/test-scenario-issue-25145-1.0-SNAPSHOT.jar 2>&1 \
                   | tee "/workspace/log/output_${current_time}.log"
           # Exit code of the java process, not of tee.
           rc=${PIPESTATUS[0]}
   
           if [ "${rc}" -eq 0 ]; then
               echo "====== Iteration ${iteration} completed successfully ======"
               echo "Sleeping 10s before next iteration..."
               sleep 10
               continue
           else
               echo "====== Iteration ${iteration} failed with exit code ${rc} ======"
               echo "Inconsistency detected. Total iterations run: ${iteration}"
               echo "Log saved to: /workspace/log/output_${current_time}.log"
               exit "${rc}"
           fi
   done
   ```
   
   After running for a while, the issue reappeared. Here is the output log:
   
   ```
   ====== Iteration 7 ======
   Start time: Thu Jan 15 04:05:10 PM UTC 2026
   SLF4J(W): No SLF4J providers were found.
   SLF4J(W): Defaulting to no-operation (NOP) logger implementation
   SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
   Starting producer task...
   Starting consumer task for subscription [sub-2]...
   Starting consumer task for subscription [sub-1]...
   Finished producer task.
   Finished consumer task for subscription [sub-1].
   Finished consumer task for subscription [sub-2].
   --- Total Acked Messages per Subscription ---
   Subscription [sub-1]: 100000 acks
   Subscription [sub-2]: 100000 acks
   Shutting down...
   Done.
   ====== Iteration 7 completed successfully ======
   Sleeping 10s before next iteration...
   ====== Iteration 8 ======
   Start time: Thu Jan 15 04:05:51 PM UTC 2026
   SLF4J(W): No SLF4J providers were found.
   SLF4J(W): Defaulting to no-operation (NOP) logger implementation
   SLF4J(W): See https://www.slf4j.org/codes.html#noProviders for further details.
   Starting producer task...
   Starting consumer task for subscription [sub-1]...
   Starting consumer task for subscription [sub-2]...
   Finished producer task.
   Finished consumer task for subscription [sub-2].
   Finished consumer task for subscription [sub-1].
   --- Total Acked Messages per Subscription ---
   Subscription [sub-1]: 99999 acks
   Subscription [sub-2]: 100000 acks
   [5426:163:7:582] not received from [sub-1]!
   Shutting down...
   Done.
   Inconsistency detected!
   ====== Iteration 8 failed with exit code 1 ======
   Inconsistency detected. Total iterations run: 8
   ```
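   
   The ID in the failing line can be broken into its parts with a small sketch 
like the one below. It assumes the ID was printed by the Pulsar client's batch 
message ID `toString()`, which, as far as I know, uses the layout 
`ledgerId:entryId:partitionIndex:batchIndex`; the class `BatchMessageIdSketch` 
and its `parse` helper are hypothetical and exist only for this illustration.
   
   ```java
   // Hypothetical helper: splits a printed ID such as "5426:163:7:582" into fields.
   // Assumes the layout ledgerId:entryId:partitionIndex:batchIndex.
   public class BatchMessageIdSketch {

       public static final class ParsedId {
           public final long ledgerId;
           public final long entryId;
           public final int partitionIndex;
           public final int batchIndex;

           ParsedId(long ledgerId, long entryId, int partitionIndex, int batchIndex) {
               this.ledgerId = ledgerId;
               this.entryId = entryId;
               this.partitionIndex = partitionIndex;
               this.batchIndex = batchIndex;
           }
       }

       public static ParsedId parse(String text) {
           String[] parts = text.split(":");
           if (parts.length != 4) {
               // Non-batch IDs have fewer fields and are rejected by this sketch.
               throw new IllegalArgumentException(
                       "expected 4 colon-separated fields: " + text);
           }
           return new ParsedId(
                   Long.parseLong(parts[0]),
                   Long.parseLong(parts[1]),
                   Integer.parseInt(parts[2]),
                   Integer.parseInt(parts[3]));
       }

       public static void main(String[] args) {
           ParsedId id = parse("5426:163:7:582");
           // prints: ledger=5426 entry=163 partition=7 batchIndex=582
           System.out.printf("ledger=%d entry=%d partition=%d batchIndex=%d%n",
                   id.ledgerId, id.entryId, id.partitionIndex, id.batchIndex);
       }
   }
   ```
   
   Under that assumption, the message missing from [sub-1] would be the one at 
batch index 582 of entry 163 in ledger 5426, on partition 7.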
   
   Unfortunately, I hadn't enabled debug logging for AckGroupingTracker, so I 
can't provide the tracker's state at the time of the failure.
   
   I've tested against several broker versions (multi-replica deployments, not 
standalone): 2.9.0, 4.0.8 and 4.1.1, and the bug still occurred. As I mentioned 
above, I suspect this bug is caused by `batchIndexAcknowledgement` or 
`PersistentAcknowledgmentsGroupingTracker`.
   
   Additionally, I ran both this test demo and the 4.0.8 broker on k8s.
   
   Thanks for your patience again!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
