Zanglei06 opened a new issue #2732:
URL: https://github.com/apache/rocketmq/issues/2732


   1. Please describe the issue you observed:
   
   - What did you do (The steps to reproduce)?
   
   - What did you expect to see?
   
   - What did you see instead?
   
   In our production environment, some messages were lost when a new consumer 
started (triggering a rebalance). We use RocketMQ 4.7.1 with the new 
LitePullConsumer API.
   
   The rocketmq-client log shows several unexpected things:
   1. Two different threads detect, at almost the same time, that the pull 
task for the same MessageQueue should be cancelled.
   
   2021-03-09 20:16:19.911 WARN  [PullMsgThread-c_g1] 
(Slf4jLoggerFactory.java:115) - The Pull Task is cancelled after doPullTask, 
MessageQueue [topic=t, brokerName=rmq-b5, queueId=3]
   2021-03-09 20:16:19.911 WARN  [PullMsgThread-c_g2] 
(Slf4jLoggerFactory.java:115) - The Pull Task is cancelled after doPullTask, 
MessageQueue [topic=t, brokerName=rmq-b5, queueId=3]
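
To check how often this happens, the client log can be scanned for queues whose cancel warning is logged by more than one pull thread. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes each log entry sits on a single line in the real log file (the entries above are wrapped by the mail client).

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RocketMQ: groups "Pull Task is cancelled"
// warnings by MessageQueue and records which threads logged them.
class DuplicateCancelScan {
    // Captures the thread name and the MessageQueue description of one entry.
    private static final Pattern CANCEL = Pattern.compile(
        "\\[(PullMsgThread-[^\\]]+)\\].*The Pull Task is cancelled after doPullTask, "
        + "(MessageQueue \\[[^\\]]*\\])");

    static Map<String, Set<String>> threadsByQueue(List<String> lines) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (String line : lines) {
            Matcher m = CANCEL.matcher(line);
            if (m.find()) {
                result.computeIfAbsent(m.group(2), k -> new TreeSet<>()).add(m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "2021-03-09 20:16:19.911 WARN  [PullMsgThread-c_g1] (Slf4jLoggerFactory.java:115) - "
                + "The Pull Task is cancelled after doPullTask, "
                + "MessageQueue [topic=t, brokerName=rmq-b5, queueId=3]",
            "2021-03-09 20:16:19.911 WARN  [PullMsgThread-c_g2] (Slf4jLoggerFactory.java:115) - "
                + "The Pull Task is cancelled after doPullTask, "
                + "MessageQueue [topic=t, brokerName=rmq-b5, queueId=3]");
        // Report only queues whose cancellation was logged by several threads.
        threadsByQueue(lines).forEach((queue, threads) -> {
            if (threads.size() > 1) {
                System.out.println(queue + " cancelled by " + threads);
            }
        });
    }
}
```

A queue reported by two or more threads suggests that more than one pull task was active for it at the same moment, which is the situation described in point 1.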
   
   2. Before the rebalance there were two consumers (two consumer cids). After 
the new consumer started there should be three (three cids), but the rebalance 
was triggered several times with a wrong cid count, which means the 
findConsumerIdList API sometimes returned a wrong value. I think a wrong 
rebalance by itself should not cause message loss, since rebalance runs in a 
single thread and the final result should be correct. But why a wrong cid list 
was returned is interesting (was a wrong broker called? A slave?).
   
   Below are the logs (I changed some internal IPs and brokerName values for 
security reasons):
   
   2021-03-09 20:16:19.777 INFO  [RebalanceService] 
(Slf4jLoggerFactory.java:100) - rebalanced result changed. 
allocateMessageQueueStrategyName=AVG, group=c_g, topic=t, clientId=XXX_C1, 
mqAllSize=24, cidAllSize=3, rebalanceResultSize=8, rebalanceResultSet=XXX (3 
cid, correct)
   2021-03-09 20:16:19.779 INFO  [RebalanceService] 
(Slf4jLoggerFactory.java:100) - rebalanced result changed. 
allocateMessageQueueStrategyName=AVG, group=c_g, topic=t, clientId=XXX_C1, 
mqAllSize=24, cidAllSize=2, rebalanceResultSize=12, rebalanceResultSet=XXX (2 
cid, wrong)
   2021-03-09 20:16:19.781 INFO  [RebalanceService] 
(Slf4jLoggerFactory.java:100) - rebalanced result changed. 
allocateMessageQueueStrategyName=AVG, group=c_g, topic=t, clientId=XXX_C1, 
mqAllSize=24, cidAllSize=3, rebalanceResultSize=8, rebalanceResultSet=XXX (3 
cid, correct)
   2021-03-09 20:16:19.784 INFO  [RebalanceService] 
(Slf4jLoggerFactory.java:100) - rebalanced result changed. 
allocateMessageQueueStrategyName=AVG, group=c_g, topic=t, clientId=XXX_C1, 
mqAllSize=24, cidAllSize=2, rebalanceResultSize=12, rebalanceResultSet=XXX (2 
cid, wrong)
   2021-03-09 20:16:19.785 INFO  [RebalanceService] 
(Slf4jLoggerFactory.java:100) - rebalanced result changed. 
allocateMessageQueueStrategyName=AVG, group=c_g, topic=t, clientId=XXX_C1, 
mqAllSize=24, cidAllSize=3, rebalanceResultSize=8, rebalanceResultSet=XXX (3 
cid, correct)
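
The rebalanceResultSize values above (8 with three cids, 12 with two) follow from splitting 24 queues evenly. The sketch below reproduces that arithmetic with a simplified averaging allocation modeled on the AVG strategy. It is an illustration only: queue indexes 0..23 stand in for the real MessageQueue objects, and placing this client at index 0 of the sorted cid list is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of an averaging allocation; not the client's code.
class AvgAllocationSketch {
    // Returns the queue indexes given to the consumer at position `index`
    // in the sorted cid list when `mqCount` queues are split among `cidCount` consumers.
    static List<Integer> allocate(int mqCount, int cidCount, int index) {
        List<Integer> result = new ArrayList<>();
        if (mqCount <= 0 || cidCount <= 0 || index < 0 || index >= cidCount) {
            return result;
        }
        int mod = mqCount % cidCount;
        int averageSize;
        if (mqCount <= cidCount) {
            averageSize = 1;
        } else {
            // The first `mod` consumers take one extra queue each.
            averageSize = (mod > 0 && index < mod) ? mqCount / cidCount + 1 : mqCount / cidCount;
        }
        int startIndex = (mod > 0 && index < mod) ? index * averageSize : index * averageSize + mod;
        int range = Math.min(averageSize, mqCount - startIndex);
        for (int i = 0; i < range; i++) {
            result.add((startIndex + i) % mqCount);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> withThree = allocate(24, 3, 0); // cidAllSize=3
        List<Integer> withTwo = allocate(24, 2, 0);   // cidAllSize=2
        System.out.println("3 cids: " + withThree.size() + " queues " + withThree);
        System.out.println("2 cids: " + withTwo.size() + " queues " + withTwo);

        // Queues that are added and dropped each time the cid count flips.
        List<Integer> churn = new ArrayList<>(withTwo);
        churn.removeAll(withThree);
        System.out.println("queues that churn on every flip: " + churn);
    }
}
```

Under these assumptions, every flip between two and three cids adds or drops four queues for this client, so the five rebalances logged within about 8 ms would repeatedly start and cancel pull tasks for the same queues.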
   
   
   3. In the rocketmq-client log, the rebalance notifications received from 
brokers include slave broker IPs.
   
   
   Additional info:
   A single Java process hosts one consumer and one producer with different 
clientIds. The consumer polls messages for one group and one topic (a single 
subscription); the producer sends messages to many topics (all different from 
the consumer's topic).
   
   
   2. Please tell us about your environment:
   
   RMQ 4.7.1 
   
   LitePullConsumer
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

