iamzhoug37 opened a new issue #249: master-slave sync model performance improve
URL: https://github.com/apache/rocketmq/issues/249
 
 
   **FEATURE REQUEST**
   
   1. The performance of the SYNC_MASTER mode can be improved by decoupling the 
processor threads from the wait for the slave broker to fetch the message.
   
   2. In scenarios that demand high message reliability, such as order or 
finance systems, only the SYNC_MASTER mode is allowed. In a performance test of 
RocketMQ, the monitoring data showed a maximum of 300000 messages per minute, 
and once I added more producer clients, the average duration roughly doubled.
   3. After reading the broker's source code for handling a produce request, I 
think this process can be improved by decoupling the processor threads from the 
wait for the slave broker to fetch the message. My understanding of the current 
flow is:
      1. The producer client sends a produce request to the broker.
      2. The broker assigns a processor thread to the request. After it passes 
through SendMessageProcessor, DefaultMessageStore and CommitLog, the message is 
written to the local disk.
      3. After the local write, the processor thread calls handleHA: if the 
broker is configured as SYNC_MASTER, the request is wrapped in a 
GroupCommitRequest and put into the queue of GroupTransferService, and the 
processor thread then starts waiting (max 5s).
      4. GroupTransferService is an independent thread that constantly checks 
whether any request can be answered (it has timed out, or the message's offset 
is less than or equal to the offset of the slave broker). Once a request can be 
answered, the GroupTransferService thread notifies the waiting processor thread 
so that it responds to the client.
      5. The client receives the response from the broker.
   
      
     If in step 3.4 the slave broker fetches messages slightly more slowly, 
because of network delay or slow disk writes on the slave broker, the master 
broker's processor threads spend longer waiting, and the master broker's 
throughput drops.
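
The blocking handoff in steps 3.3 and 3.4 can be sketched as follows. This is a minimal illustration, not RocketMQ's actual code: `BlockingHaSketch` and `Request` are hypothetical names, `Request` is only a stripped-down analogue of `CommitLog.GroupCommitRequest`, and the 50 ms delay stands in for the slave's fetch latency.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for the blocking wait described above (hypothetical names).
public class BlockingHaSketch {

    /** Hypothetical, stripped-down analogue of CommitLog.GroupCommitRequest. */
    static final class Request {
        final long nextOffset;
        private final CountDownLatch latch = new CountDownLatch(1);
        private volatile boolean slaveOk;

        Request(long nextOffset) {
            this.nextOffset = nextOffset;
        }

        // Called by the transfer thread once the slave has caught up, or on timeout.
        void wakeup(boolean ok) {
            slaveOk = ok;
            latch.countDown();
        }

        // Called by the processor thread: parks until wakeup() or until the timeout expires.
        boolean waitForSlave(long timeoutMillis) throws InterruptedException {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS) && slaveOk;
        }
    }

    public static void main(String[] args) throws Exception {
        Request req = new Request(1024L);

        // Simulated transfer thread: the slave acknowledgement arrives 50 ms later.
        Thread transfer = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            req.wakeup(true);
        });

        long start = System.nanoTime();
        transfer.start();
        boolean ok = req.waitForSlave(5000); // the processor thread is stuck here
        long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("ok=" + ok + ", processor thread blocked for about " + waitedMs + " ms");
    }
}
```

While `waitForSlave` is parked, the processor thread cannot take another produce request, which is why any extra slave latency shows up directly as lost master throughput.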
   
   4. Proposed optimization:
        1. After step 3.2, the processor thread should return instead of waiting 
for the slave broker to fetch the message.
        2. The work of waiting for the slave broker can be handed to the 
GroupTransferService thread. The GroupTransferService data structure changes as 
follows:
   old data structure:
   `private final WaitNotifyObject notifyTransferObject = new 
WaitNotifyObject();`
   `private volatile List<CommitLog.GroupCommitRequest> requestsWrite = new 
ArrayList<>();`
   `private volatile List<CommitLog.GroupCommitRequest> requestsRead = new 
ArrayList<>();`
   new data structure:
   `private ConcurrentSkipListMap<Long , CommitLog.GroupCommitRequest> 
groupCommitRequestConcurrentSkipListMap = new ConcurrentSkipListMap<>()`
   The workflow also changes as follows:
   GroupTransferService constantly iterates over the skip list to check whether 
any request can be answered (it has timed out, or the slave has fetched the 
message). If a request can be answered, the GroupTransferService thread itself 
sends the response for it.
         3. In step 3.3, before the request is put into the skip list, check 
whether push2SlaveMaxOffset is already greater than the offset the request 
needs. If it is, the request is answered immediately.
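
A minimal sketch of the non-blocking variant proposed in step 4, under stated assumptions. The class `AsyncHaSketch`, the `Pending` holder, and the methods `submit`, `advanceSlaveOffset` and `sweep` are hypothetical names, not RocketMQ APIs. Using `CompletableFuture` to carry the response is an illustrative choice of mine, not something the proposal specifies, and the fast path uses `>=` to match the "less than or equal" check in step 3.4. Timestamps are passed in explicitly so the logic is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: pending requests are keyed by the offset the slave must reach.
public class AsyncHaSketch {

    /** Hypothetical holder for one waiting produce request. */
    static final class Pending {
        final long deadlineMillis;
        final CompletableFuture<Boolean> future = new CompletableFuture<>();

        Pending(long deadlineMillis) {
            this.deadlineMillis = deadlineMillis;
        }
    }

    private final ConcurrentSkipListMap<Long, Pending> pending = new ConcurrentSkipListMap<>();
    private final AtomicLong push2SlaveMaxOffset = new AtomicLong();

    // Processor thread: registers the request and returns at once instead of blocking.
    CompletableFuture<Boolean> submit(long nextOffset, long timeoutMillis, long nowMillis) {
        // Fast path (step 4.3): the slave already has this message.
        if (push2SlaveMaxOffset.get() >= nextOffset) {
            return CompletableFuture.completedFuture(true);
        }
        Pending fresh = new Pending(nowMillis + timeoutMillis);
        Pending existing = pending.putIfAbsent(nextOffset, fresh);
        Pending registered = existing != null ? existing : fresh;
        // Re-check: the slave may have advanced between the first check and the insert.
        if (push2SlaveMaxOffset.get() >= nextOffset) {
            pending.remove(nextOffset, registered);
            registered.future.complete(true);
        }
        return registered.future;
    }

    // HA connection: records the highest offset the slave has acknowledged.
    void advanceSlaveOffset(long ackOffset) {
        push2SlaveMaxOffset.accumulateAndGet(ackOffset, Math::max);
    }

    // Transfer thread: answers every request that is satisfied or has timed out.
    void sweep(long nowMillis) {
        long acked = push2SlaveMaxOffset.get();
        // All keys <= acked are satisfied; the sorted map yields them without a full scan.
        ConcurrentNavigableMap<Long, Pending> ready = pending.headMap(acked, true);
        Map.Entry<Long, Pending> e;
        while ((e = ready.pollFirstEntry()) != null) {
            e.getValue().future.complete(true);
        }
        // Whatever remains is still waiting for the slave; fail the expired ones.
        for (Map.Entry<Long, Pending> waiting : pending.entrySet()) {
            Pending p = waiting.getValue();
            if (p.deadlineMillis <= nowMillis && pending.remove(waiting.getKey(), p)) {
                p.future.complete(false);
            }
        }
    }

    public static void main(String[] args) {
        AsyncHaSketch ha = new AsyncHaSketch();
        CompletableFuture<Boolean> f = ha.submit(100L, 5000L, 0L);
        f.thenAccept(ok -> System.out.println("respond to producer, slaveOk=" + ok));
        ha.advanceSlaveOffset(100L); // slave catches up
        ha.sweep(1L);                // transfer thread completes the request
    }
}
```

In a real broker the callback attached to the future would write the response to the producer's channel, and `sweep` would run in the GroupTransferService loop. Keying the map by offset means the satisfied requests are always a prefix of the sorted keys, which is presumably why a skip list was chosen over the two array lists.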
   5. Preliminary results:
   Before the optimization:
   300000 messages per minute, avg cost 10 ms
   
![image](https://user-images.githubusercontent.com/21154201/37750593-de90ad12-2dc8-11e8-96ca-356ab0da23b0.png)
   After the optimization:
   5000000 messages per minute, avg cost 1.5 ms
   
![image](https://user-images.githubusercontent.com/21154201/37750596-e1fc1df6-2dc8-11e8-889a-5a98887a52a0.png)
   

