[ https://issues.apache.org/jira/browse/ROCKETMQ-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981030#comment-15981030 ]
ASF GitHub Bot commented on ROCKETMQ-184:
-----------------------------------------

Github user coveralls commented on the issue:

    https://github.com/apache/incubator-rocketmq/pull/95

    [![Coverage Status](https://coveralls.io/builds/11210427/badge)](https://coveralls.io/builds/11210427)

    Coverage increased (+0.02%) to 37.869% when pulling **40d77eaffe64cf4e3070f5e2440e7d0fa4281a0e on Jaskey:ROCKETMQ-184-slave-switch** into **6a9628b3c3e6835e37baf7b58ad9300364d4d384 on apache:develop**.

> It takes too long(3-33 seconds) to switch to read from slave when master crashes
> --------------------------------------------------------------------------------
>
>                 Key: ROCKETMQ-184
>                 URL: https://issues.apache.org/jira/browse/ROCKETMQ-184
>             Project: Apache RocketMQ
>          Issue Type: Improvement
>          Components: rocketmq-client, rocketmq-remoting
>            Reporter: Jaskey Lam
>            Assignee: Xiaorui Wang
>             Fix For: 4.2.0-incubating
>
>
> When the master crashes, no notifier callback is triggered to pull messages again.
> Instead, the client relies on the scan service to trigger a timeout and then re-pull.
> But the pull command has a 30-second timeout, and after the timeout, the next pull
> operation is scheduled 3 seconds later.
> So it takes 3 to 33 seconds to switch to the slave, which is too long and can be
> optimized.
> The root cause is that the re-pull below takes too long to be triggered when the master
> crashes:
> {code}
>     @Override
>     public void onException(Throwable e) {
>         if (!pullRequest.getMessageQueue().getTopic().startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
>             log.warn("execute the pull request exception", e);
>         }
>
>         DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_EXCEPTION);
>     }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)