GitHub user elfwalk closed a discussion: 
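With a Docker deployment of 5.1.1 in controller mode, a broker slave on a different host does not sync with the master, and automatic failover does not work either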

**Environment:**
1. Two VMs: hostA 194.0.11.118, hostB 194.0.11.119
2. Docker image: apache/rocketmq:5.1.1
3. Deployed on hostA: nameserver + controller (3 containers) + 2 brokers
```
#ControllerGroup        RaftGroup
#ControllerLeaderId     n1
#ControllerLeaderAddress        194.0.11.118:10082
#Peer:  n0:194.0.11.118:10081
#Peer:  n1:194.0.11.118:10082
#Peer:  n2:194.0.11.118:10083
```
4. Deployed on hostB: 1 broker

**Symptoms:**
```
#brokerName     broker-a
#MasterBrokerId 1
#MasterAddr     194.0.11.118:10111
#MasterEpoch    3
#SyncStateSetEpoch      6
#SyncStateSetNums       2

InSyncReplica:  ReplicaIdentity{brokerName='broker-a', brokerId=1, brokerAddress='194.0.11.118:10111', alive=true}
InSyncReplica:  ReplicaIdentity{brokerName='broker-a', brokerId=2, brokerAddress='194.0.11.118:10211', alive=true}
NotInSyncReplica:       ReplicaIdentity{brokerName='broker-a', brokerId=3, brokerAddress='194.0.11.119:10311', alive=true}
```
```
#clusterName    DCluster
#brokerName     broker-a
#brokerAddr     194.0.11.118:10111
#brokerId       0
#Epoch: EpochEntry{epoch=1, startOffset=0, endOffset=0}

#clusterName    DCluster
#brokerName     broker-a
#brokerAddr     194.0.11.119:10311
#brokerId       3

#clusterName    DCluster
#brokerName     broker-a
#brokerAddr     194.0.11.118:10211
#brokerId       2
#Epoch: EpochEntry{epoch=1, startOffset=0, endOffset=0}
```


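The original plan was to deploy one broker on each of hostA and hostB as a master/slave pair, but the broker on hostB never joins the SyncStateSet, so master and slave cannot sync and automatic master/slave switching cannot happen.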

Error messages that may or may not be related:
```
ERROR FlowMonitor - Interrupted
java.lang.InterruptedException: null
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at org.apache.rocketmq.common.CountDownLatch2.await(CountDownLatch2.java:114)
        at org.apache.rocketmq.common.ServiceThread.waitForRunning(ServiceThread.java:117)
        at org.apache.rocketmq.store.ha.FlowMoni
```
```
ERROR NettyServerCodecThread_3 - decode exception, 194.0.11.118:38438
io.netty.handler.codec.TooLongFrameException: Adjusted frame length exceeds 16777216: 1195725860 - discarded
        at io.netty.handler.codec.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:503)
        at io.netty.handler.codec.LengthFieldBasedFrameDecoder.failIfNecessary(LengthFieldBasedFrameDecoder.java:489)
```
Is the port in 194.0.11.118:38438 randomly generated? If so, the container has no mapping for it, so it certainly cannot be reached.
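One way to read the oversized frame length in the `TooLongFrameException` is to treat it as four bytes of payload that were misinterpreted as a big-endian length field. The sketch below does that for the value in the log. It assumes the decoder adds 4 bytes (the size of the length field) to the raw value before reporting it; that offset is an assumption about how the broker's Netty decoder is configured, and the helper name `decode_frame_length` is made up for illustration.

```python
import struct


def decode_frame_length(adjusted_length: int, adjustment: int = 4) -> bytes:
    """Return the 4 raw bytes that were parsed as a big-endian length field.

    `adjustment` is an assumption: the number of bytes the decoder adds to
    the raw length before reporting the "adjusted" frame length.
    """
    raw = adjusted_length - adjustment
    return struct.pack(">I", raw)


# Value copied from the TooLongFrameException message above.
print(decode_frame_length(1195725860))  # prints b'GET '
```

Under that assumption the bytes are the ASCII text `GET `, which would point to some client sending a plain HTTP request to the broker's remoting port rather than to a broker-to-broker protocol error. If so, `194.0.11.118:38438` is most likely that client's ephemeral source port, which would not need a container port mapping.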


After deploying one more broker container on hostA, I found that the new broker joins the SyncStateSet normally, and master/slave switching works too.


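**Suspicion:**
brokerIP1 is set to the host machine's IP on every broker, but does inter-broker communication go over the containers' private IPs? That would explain why containers on the same host communicate fine while cross-host communication fails.
Has anyone run into the same problem? Is there a way to inspect or verify whether communication between brokers is working?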
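As a first check on the connectivity question, a plain TCP connect test run from each side (ideally from inside the broker container's network namespace) shows whether the advertised addresses are reachable at all. This is only a minimal sketch: it assumes Python 3 is available where you run it, the function names are hypothetical, and it tests TCP reachability only, not the RocketMQ protocol. The HA replication port is not shown in this post, so it would have to be added from your own broker configuration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_targets(targets, timeout: float = 3.0) -> dict:
    """Map each "host:port" string to whether a TCP connect succeeded."""
    return {
        f"{host}:{port}": can_connect(host, port, timeout)
        for host, port in targets
    }
```

For example, running `check_targets([("194.0.11.118", 10111), ("194.0.11.118", 10211)])` from the hostB side, and `check_targets([("194.0.11.119", 10311)])` from the hostA side, covers the broker addresses listed in the output above. A `False` result in only one direction would be consistent with the port-mapping or private-IP suspicion.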




GitHub link: https://github.com/apache/rocketmq/discussions/6914
