RongtongJin commented on code in PR #6100:
URL: https://github.com/apache/rocketmq/pull/6100#discussion_r1127201779


##########
broker/src/main/java/org/apache/rocketmq/broker/controller/ReplicasManager.java:
##########
@@ -148,8 +178,24 @@ private boolean startBasicService() {
         }
 
         if (this.state == State.FIRST_TIME_SYNC_CONTROLLER_METADATA_DONE) {
-            if (registerBrokerToController()) {
-                LOGGER.info("First time register broker success");
+            for (int retryTimes = 0; retryTimes < 5; retryTimes++) {
+                if (register()) {
+                    LOGGER.info("First time register broker success");
+                    this.state = State.REGISTER_TO_CONTROLLER_DONE;
+                    break;
+                }
+            }
+            // register 5 times but still unsuccessful
+            if (this.state != State.REGISTER_TO_CONTROLLER_DONE) {
+                return false;
+            }
+        }
+
+        if (this.state == State.REGISTER_TO_CONTROLLER_DONE) {
+            // The scheduled task for heartbeat sending is not starting now, so we should manually send heartbeat request
+            this.sendHeartbeatToController();
+            if (this.masterBrokerId != null || brokerElect()) {
+                LOGGER.info("Master in this broker set is elected");

Review Comment:
   
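There is a concurrency problem here. Suppose the node's first startup fails (for example, because there is no master in the brokerSet), and an election then succeeds in the meantime, so the node is notified to become a Slave (via notifyBrokerRoleChange). When the node starts a second time, brokerElect is skipped at this point, so isIsolated stays true and the Slave node can never come online properly.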
   
![image](https://user-images.githubusercontent.com/21963954/223287056-a9a8500b-c93a-438f-b611-80a45bebed75.png)
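
A minimal Java sketch of the sequence described in this comment may make the race easier to follow. It is a toy model, not the real `ReplicasManager`: the class `IsolationRaceSketch`, the `controllerHasMaster` flag, and the bodies of `brokerElect()` and `notifyBrokerRoleChange()` are hypothetical stand-ins, and it assumes that only the `brokerElect()` path clears `isIsolated`.

```java
public class IsolationRaceSketch {
    enum State { REGISTER_TO_CONTROLLER_DONE, RUNNING }

    // Field names follow the diff; everything around them is simplified.
    State state = State.REGISTER_TO_CONTROLLER_DONE;
    Long masterBrokerId = null;
    boolean isIsolated = true;
    // Hypothetical stand-in for "the controller already knows a master".
    boolean controllerHasMaster = false;

    // Hypothetical stand-in: the election only succeeds once a master exists.
    boolean brokerElect() {
        if (!controllerHasMaster) {
            return false;
        }
        masterBrokerId = 1L;
        isIsolated = false; // assumption: only this path ends isolation
        return true;
    }

    // Hypothetical stand-in for the role-change notification from the controller.
    void notifyBrokerRoleChange(long newMasterId) {
        masterBrokerId = newMasterId; // assumption: isIsolated is left untouched
    }

    // Mirrors the condition in the diff: brokerElect() is short-circuited
    // whenever masterBrokerId is already set.
    boolean startBasicService() {
        if (state == State.REGISTER_TO_CONTROLLER_DONE) {
            if (masterBrokerId != null || brokerElect()) {
                state = State.RUNNING;
                return true;
            }
            return false;
        }
        return state == State.RUNNING;
    }

    // Replays the sequence from the review comment and reports whether the
    // broker is still isolated after the second, "successful" start.
    static boolean reproduce() {
        IsolationRaceSketch broker = new IsolationRaceSketch();
        boolean firstStart = broker.startBasicService();  // fails: no master yet
        broker.controllerHasMaster = true;                // election succeeds elsewhere
        broker.notifyBrokerRoleChange(2L);                // node is told to become Slave
        boolean secondStart = broker.startBasicService(); // brokerElect() is skipped
        return !firstStart && secondStart && broker.isIsolated;
    }

    public static void main(String[] args) {
        System.out.println("still isolated after second start: " + reproduce());
    }
}
```

In this model the second start reports success while `isIsolated` is still true, which is the stuck-Slave symptom. One possible direction, stated only as a suggestion, is to clear the isolation flag (or re-run the sync step) whenever `masterBrokerId` is already set, rather than relying on `brokerElect()` alone.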
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
