[GitHub] [rocketmq] baihezhuo commented on issue #1585: RMQ 4.2.0 - 4.5.2 [TIMEOUT_CLEAN_QUEUE]

GitBox Mon, 11 Nov 2019 03:48:51 -0800

baihezhuo commented on issue #1585: RMQ 4.2.0 - 4.5.2 [TIMEOUT_CLEAN_QUEUE]
URL: https://github.com/apache/rocketmq/issues/1585#issuecomment-552412411
 
 
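> @baihezhuo
> Try enabling transientStorePoolEnable. Also, if possible, try the spin lock by setting useReentrantLockWhenPutMessage to false, and at the same time set sendMessageThreadPoolNums to a smaller value, for example 5 (not many; you will need to tune the exact number yourself). In addition, check the disk state at the time the error above is reported, and check the consumer side to see whether it is continuously consuming cold data. You can also read this [article](https://mp.weixin.qq.com/s/1yFedcwtQ7mYcuHDvGCrqw) first.

1. Enabling transientStorePoolEnable: I looked into this before. Read/write separation can cause message loss, and some of the messages in our scenario cannot tolerate that.
2. Using the spin lock together with sendMessageThreadPoolNums=5 or so: we have not tried this yet; we will try it later.
3. In our load-test environment, with JMeter running 500-1000 concurrent threads against messages of about 3K, we never hit flow control, and disk TPS reaches roughly 3000-4000, which matches our expectations. In production, however, the console shows that broker TPS is not high: at around 200 TPS a large number of flow-control log entries already appear. We first suspected synchronous replication, possibly a network problem. After talking to Alibaba Cloud and capturing packets on the master and slave nodes, the Alibaba Cloud engineer concluded that there is indeed some retransmission, but the retransmission interval is very short (about 10 ms) and the retransmissions are initiated actively by the program, not caused by network jitter, instability, or similar factors.
4. In production we did also adjust waitTimeMillsInSendQueue=400, but that only lowered the flow-control rate slightly.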
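
For reference, the settings discussed in this comment could be collected in the broker configuration file roughly as follows. This is only a sketch: the parameter names and values are the ones mentioned in this thread, not tested recommendations, and the file name `broker.conf` is an assumption about the deployment.

```properties
# broker.conf (file name assumed; values taken from this thread, not verified recommendations)

# Left disabled here because of the message-loss concern described in point 1.
transientStorePoolEnable=false

# Suggested in the quoted reply: use the spin lock instead of the reentrant lock when putting messages.
useReentrantLockWhenPutMessage=false

# Suggested in the quoted reply: a small send thread pool, to be tuned per environment.
sendMessageThreadPoolNums=5

# Already applied in production (point 4); it only reduced the flow-control rate slightly.
waitTimeMillsInSendQueue=400
```

Points 2 and 4 are independent changes, so it is probably easier to judge their effect on the flow-control rate if they are applied and measured one at a time.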


