Following the hint, I read the recover() code: after recovery, the messages are 
reput (dispatched again). Analyzing the logs again, I found several related errors.
The first error log of ReputMessageService:
2018-09-11 16:46:46.976 WARN ReputMessageService - [BUG]logic queue order maybe 
wrong, expectLogicOffset: 1050988840 currentLogicOffset: 1050988820 Topic: 
role_change QID: 4 Diff: 20
The cqOffset being reput is 52549442 (1050988840 / 20), but the consume queue 
only reaches 52549441 (1050988820 / 20). So something is wrong with cqOffset 
52549441 of qid 4.
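
The conversion between the logic offsets in the warning and cqOffsets can be 
checked with a minimal sketch. It assumes each consume queue entry occupies 20 
bytes, which is consistent with "Diff: 20" in the log; the class and method 
names are hypothetical.

```java
final class LogicOffsetCheck {
    // Assumed size in bytes of one consume queue entry (matches "Diff: 20" in the log).
    static final int CQ_STORE_UNIT_SIZE = 20;

    /** Converts a byte offset inside the consume queue into an entry index (cqOffset). */
    static long toCqOffset(long logicOffset) {
        return logicOffset / CQ_STORE_UNIT_SIZE;
    }

    /** Number of entries missing between the current and the expected logic offset. */
    static long missingEntries(long expectLogicOffset, long currentLogicOffset) {
        return (expectLogicOffset - currentLogicOffset) / CQ_STORE_UNIT_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(toCqOffset(1050988840L));                  // 52549442
        System.out.println(toCqOffset(1050988820L));                  // 52549441
        System.out.println(missingEntries(1050988840L, 1050988820L)); // 1
    }
}
```

Under that assumption, exactly one entry (cqOffset 52549441) is missing from the 
consume queue.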

From the producer log, the phyOffset of cqOffset 52549441 is 80549937024, as 
follows:
2018-09-11 16:37:32,006 [ INFO ] MissChecker - send msg success, 
topic:role_change, tag:1536655040000, index:399503976, result:SendResult 
[sendStatus=SLAVE_NOT_AVAILABLE, msgId=0AB314D91D3F070DEA4E3710B4E7BD18, 
offsetMsgId=0A60706900002A9F00000012C1267F80, messageQueue=MessageQueue 
[topic=role_change, brokerName=syz-00, queueId=4], queueOffset=52549441]
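
The phyOffset can be read directly from the offsetMsgId in that log line. The 
sketch below assumes the id is the hex encoding of a 4-byte IPv4 address, a 
4-byte port and an 8-byte commit log offset, in that order; the class and 
method names are hypothetical.

```java
final class OffsetMsgIdDecoder {
    /** Assumed layout: hex chars 0-7 are the four bytes of the broker's IPv4 address. */
    static String decodeHost(String offsetMsgId) {
        StringBuilder host = new StringBuilder();
        for (int i = 0; i < 8; i += 2) {
            if (i > 0) {
                host.append('.');
            }
            host.append(Integer.parseInt(offsetMsgId.substring(i, i + 2), 16));
        }
        return host.toString();
    }

    /** Assumed layout: hex chars 8-15 are the broker port. */
    static int decodePort(String offsetMsgId) {
        return Integer.parseInt(offsetMsgId.substring(8, 16), 16);
    }

    /** Assumed layout: hex chars 16-31 are the physical (commit log) offset. */
    static long decodeOffset(String offsetMsgId) {
        return Long.parseLong(offsetMsgId.substring(16, 32), 16);
    }

    public static void main(String[] args) {
        String id = "0A60706900002A9F00000012C1267F80";
        System.out.println(decodeHost(id));   // 10.96.112.105
        System.out.println(decodePort(id));   // 10911
        System.out.println(decodeOffset(id)); // 80549937024
    }
}
```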

In the broker's recovery log, the max phy offset is 80549937216. As the messages 
have a fixed length of 192 bytes, the last message starts at offset 80549937024 
(80549937216 - 192), and its cqOffset is 52549441.
2018-09-11 16:44:11.292 INFO main - load over, and the max phy offset = 
80549937216

I also found some other log entries related to this issue:
2018-09-11 16:44:11.123 ERROR main - [BUG]read total count not equals msg total 
size. totalSize=192, readTotalCount=140, bodyLen=38, topicLen=11, 
propertiesLength=0
2018-09-11 16:44:11.134 INFO main - /home/suiyuzeng/store/consumequeue/0 mkdir 
OK
2018-09-11 16:44:11.134 WARN main - found a illegal magic code 0x0
2018-09-11 16:44:11.180 INFO main - topic:role_change, queue:4, queue offset 
after truncate:52549441, origin:52549441
I added the last line for debugging: in truncateDirtyLogicFiles(), just before 
returning, it reads the cqOffset via getMaxOffsetInQueue(). The cqOffset should 
be 52549442. 

I think the last message (cqOffset 52549441, phyOffset 80549937024) was 
damaged. In the log, totalSize, bodyLen and topicLen are right, but 
propertiesLength is wrong. checkMessageAndReturnSize() detects the problem and 
marks the result as not successful, but the message is dispatched anyway, 
because only the size is checked:

      DispatchRequest dispatchRequest = this.checkMessageAndReturnSize(byteBuffer, checkCRCOnRecover);
      int size = dispatchRequest.getMsgSize();
      // Normal data
      if (size > 0) {
             ..........
      }
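
The numbers in the "[BUG]read total count not equals msg total size" line can 
be reproduced by hand. The sketch below assumes the stored message layout 
listed in the comment (with 8-byte IPv4 host fields); the class and method 
names are hypothetical. With that layout, the logged bodyLen, topicLen and 
propertiesLength add up to the logged readTotalCount of 140.

```java
final class MessageLengthCheck {
    // Assumed fixed part of a stored message, in bytes:
    // totalSize 4, magicCode 4, bodyCRC 4, queueId 4, flag 4, queueOffset 8,
    // physicalOffset 8, sysFlag 4, bornTimestamp 8, bornHost 8, storeTimestamp 8,
    // storeHost 8, reconsumeTimes 4, preparedTransactionOffset 8,
    // plus the length prefixes: body 4, topic 1, properties 2.
    static final int FIXED_BYTES =
        4 + 4 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 8 + 8 + 4 + 8 + 4 + 1 + 2;

    /** Length a message should have, given its variable-length parts. */
    static int expectedLength(int bodyLen, int topicLen, int propertiesLen) {
        return FIXED_BYTES + bodyLen + topicLen + propertiesLen;
    }

    /** Properties length implied by totalSize when the other fields are trusted. */
    static int impliedPropertiesLength(int totalSize, int bodyLen, int topicLen) {
        return totalSize - FIXED_BYTES - bodyLen - topicLen;
    }

    public static void main(String[] args) {
        System.out.println(expectedLength(38, 11, 0));            // 140
        System.out.println(impliedPropertiesLength(192, 38, 11)); // 52
    }
}
```

If the layout assumption holds, a totalSize of 192 implies 52 bytes of 
properties, while the damaged message reports 0.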

Because the topic was not set in that DispatchRequest, we see the log 
"/home/suiyuzeng/store/consumequeue/0 mkdir OK". So the message with cqOffset 
52549441 was probably not dispatched to the consume queue.

In recoverAbnormally(), only the size is checked. Should we also check isSuccess, 
as recoverNormally() does, and truncate the messages when isSuccess is false?
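
To make the question concrete, here is a simplified model of the two decision 
rules. It is not the RocketMQ code: the class, enum and method names are 
hypothetical, and the branches for size <= 0 are assumptions added only to 
make the sketch complete.

```java
final class RecoverDecisionSketch {
    enum Action { DISPATCH, NEXT_FILE, STOP_AND_TRUNCATE }

    /** Size-only rule: a damaged message with size > 0 is still dispatched. */
    static Action sizeOnly(boolean success, int size) {
        if (size > 0) {
            return Action.DISPATCH;      // taken even when success is false
        } else if (size == 0) {
            return Action.NEXT_FILE;     // assumed: end of the current file
        }
        return Action.STOP_AND_TRUNCATE; // assumed: unreadable data
    }

    /** Proposed rule: stop and truncate as soon as the check reports a failure. */
    static Action withSuccessCheck(boolean success, int size) {
        if (!success) {
            return Action.STOP_AND_TRUNCATE;
        }
        return sizeOnly(success, size);
    }

    public static void main(String[] args) {
        // The damaged message from the log: size 192, check not successful.
        System.out.println(sizeOnly(false, 192));         // DISPATCH
        System.out.println(withSuccessCheck(false, 192)); // STOP_AND_TRUNCATE
    }
}
```

In this model, the damaged 192-byte message is dispatched under the size-only 
rule and truncated under the proposed one, while intact messages are handled 
the same way by both.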

[ Full content available at: https://github.com/apache/rocketmq/issues/467 ]
This message was relayed via gitbox.apache.org for [email protected]
