GitHub user RongtongJin added a comment to the discussion: 
Deploying 5.1.1 in controller mode with Docker: master and slave brokers on different hosts do not sync, and automatic failover does not work

> **Environment:**
> 
> 1. Two VMs: hostA 194.0.11.118, hostB 194.0.11.119
> 2. Docker image: apache/rocketmq:5.1.1
> 3. Deployed on hostA: nameserver + controller (3 containers) + 2 brokers
> 
> ```
> #ControllerGroup      RaftGroup
> #ControllerLeaderId   n1
> #ControllerLeaderAddress      194.0.11.118:10082
> #Peer:        n0:194.0.11.118:10081
> #Peer:        n1:194.0.11.118:10082
> #Peer:        n2:194.0.11.118:10083
> ```
> 
> 4. Deployed on hostB: 1 broker
> 
> **Symptoms:**
> 
> ```
> #brokerName   broker-a
> #MasterBrokerId       1
> #MasterAddr   194.0.11.118:10111
> #MasterEpoch  3
> #SyncStateSetEpoch    6
> #SyncStateSetNums     2
> 
> InSyncReplica:        ReplicaIdentity{brokerName='broker-a', brokerId=1, brokerAddress='194.0.11.118:10111', alive=true}
> InSyncReplica:        ReplicaIdentity{brokerName='broker-a', brokerId=2, brokerAddress='194.0.11.118:10211', alive=true}
> NotInSyncReplica:     ReplicaIdentity{brokerName='broker-a', brokerId=3, brokerAddress='194.0.11.119:10311', alive=true}
> ```
> 
> ```
> #clusterName  DCluster
> #brokerName   broker-a
> #brokerAddr   194.0.11.118:10111
> #brokerId     0
> #Epoch: EpochEntry{epoch=1, startOffset=0, endOffset=0}
> 
> #clusterName  DCluster
> #brokerName   broker-a
> #brokerAddr   194.0.11.119:10311
> #brokerId     3
> 
> #clusterName  DCluster
> #brokerName   broker-a
> #brokerAddr   194.0.11.118:10211
> #brokerId     2
> #Epoch: EpochEntry{epoch=1, startOffset=0, endOffset=0}
> ```
> 
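> The plan was to deploy one broker each on hostA and hostB as master and slave, but the broker on hostB never joins the SyncStateSet. As a result the master and slave do not sync, and automatic master/slave failover cannot happen.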
> 
> Error messages that may or may not be related:
> 
> ```
> ERROR FlowMonitor - Interrupted
> java.lang.InterruptedException: null
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>         at org.apache.rocketmq.common.CountDownLatch2.await(CountDownLatch2.java:114)
>         at org.apache.rocketmq.common.ServiceThread.waitForRunning(ServiceThread.java:117)
>         at org.apache.rocketmq.store.ha.FlowMoni
> ```
> 
> ```
> ERROR NettyServerCodecThread_3 - decode exception, 194.0.11.118:38438
> io.netty.handler.codec.TooLongFrameException: Adjusted frame length exceeds 16777216: 1195725860 - discarded
>         at io.netty.handler.codec.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:503)
>         at io.netty.handler.codec.LengthFieldBasedFrameDecoder.failIfNecessary(LengthFieldBasedFrameDecoder.java:489)
> ```
> 
> Is port 38438 in 194.0.11.118:38438 randomly generated? The container does not map that port, so it would certainly be unreachable.
> 
> After deploying one more broker container on hostA, I found that the new broker joins the SyncStateSet normally, and master/slave switchover also works.
> 
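> **Suspicion:** brokerIP1 is set to the host IP on every broker, but does broker-to-broker communication go over the containers' private IPs? That would explain why containers on the same host communicate normally while communication across hosts fails. 
> Has anyone run into the same problem? Is there a way to inspect or verify whether communication between brokers is working?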
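
A side note on the `TooLongFrameException` quoted above: the logged length 1195725860 looks more like text than a real frame size. The sketch below assumes a 4-byte big-endian length prefix and that the reported "adjusted" value is the raw length field plus those 4 bytes (Netty's `LengthFieldBasedFrameDecoder` adds the length-field end offset before comparing against the limit). Under that assumption the first four bytes received decode to ASCII `GET `, which would suggest that some HTTP client (a health check or a browser, for instance) hit the broker's remoting port. The helper name `leading_bytes` is hypothetical.

```python
import struct

# Assumption: the frame starts with a 4-byte big-endian length field and the
# logged "adjusted frame length" is that field's value plus these 4 bytes.
LENGTH_FIELD_BYTES = 4


def leading_bytes(adjusted_frame_length: int) -> bytes:
    """Recover the four bytes that the decoder interpreted as a length field."""
    raw_length = adjusted_frame_length - LENGTH_FIELD_BYTES
    return struct.pack(">I", raw_length)


print(leading_bytes(1195725860))  # b'GET '
```

If that reading is right, this error is probably unrelated to the replication problem. The address in the log line is most likely the remote peer, so 38438 would be an ephemeral client-side port rather than a port that needs mapping.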

Communication between brokers uses brokerIP2. Has that been set to the host IP?
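
One rough way to verify reachability between brokers is a plain TCP connect test, run from inside each broker container against the addresses the other brokers advertise. This is a minimal sketch, not a RocketMQ tool: the function names `can_connect` and `check_targets` are hypothetical, and the HA port in the usage note is an assumption (listen port + 1 unless `haListenPort` is set explicitly in the broker config).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_targets(targets, timeout: float = 3.0) -> dict:
    """Probe each (host, port) pair and map it to its reachability."""
    return {(host, port): can_connect(host, port, timeout) for host, port in targets}
```

For example, from the hostB broker container one could call `check_targets([("194.0.11.118", 10111), ("194.0.11.118", 10112)])`, where 10111 is the master address from the output above and 10112 is the assumed HA port. A `False` for the HA port while the remoting port answers would point at a missing port mapping or at an advertised address that is not routable across hosts.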

GitHub link: 
https://github.com/apache/rocketmq/discussions/6914#discussioncomment-6202874

----
This is an automatically sent email for dev@rocketmq.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@rocketmq.apache.org
