[
https://issues.apache.org/jira/browse/HDDS-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17683349#comment-17683349
]
Attila Doroszlai commented on HDDS-7879:
----------------------------------------
[~szetszwo], it seems this started appearing right after the merge of the
Ozone-Streaming feature branch (2022/12/16).
{{NettyServerStreamRpc}} binds to a random port, which may collide with the
configured gRPC port (here both end up on 37507):
{code:title=https://github.com/adoroszlai/ozone-build-results/blob/master/2023/02/01/19852/it-scm/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.TestStorageContainerManagerHA.txt}
2023-02-01 21:53:35,619 [Listener at 127.0.0.1/35987] INFO grpc.GrpcConfigKeys (ConfUtils.java:logGet(46)) - raft.grpc.server.port = 37507 (custom)
...
2023-02-01 21:53:35,625 [5846a44b-842e-49d3-8750-e0fae6c03dfc-NettyServerStreamRpc-bossGroup--thread1] INFO logging.LoggingHandler (AbstractInternalLogger.java:log(148)) - [id: 0x6df49c76] BIND: 0.0.0.0/0.0.0.0:0
2023-02-01 21:53:35,625 [5846a44b-842e-49d3-8750-e0fae6c03dfc-NettyServerStreamRpc-bossGroup--thread1] INFO logging.LoggingHandler (AbstractInternalLogger.java:log(148)) - [id: 0x6df49c76, L:/0:0:0:0:0:0:0:0:37507] ACTIVE
...
2023-02-01 21:53:35,721 [Listener at 0.0.0.0/40331] INFO server.RaftServer (RaftServerProxy.java:startImpl(393)) - 5846a44b-842e-49d3-8750-e0fae6c03dfc: start RPC server
2023-02-01 21:53:35,722 [Listener at 0.0.0.0/40331] ERROR server.GrpcService (ExitUtils.java:terminate(133)) - Terminating with exit status 1: Failed to start Grpc server
java.io.IOException: Failed to bind to address 0.0.0.0/0.0.0.0:37507
at org.apache.ratis.thirdparty.io.grpc.netty.NettyServer.start(NettyServer.java:328)
at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.start(ServerImpl.java:183)
at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.start(ServerImpl.java:92)
at org.apache.ratis.grpc.server.GrpcService.startImpl(GrpcService.java:258)
at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
at org.apache.ratis.server.RaftServerRpcWithProxy.start(RaftServerRpcWithProxy.java:72)
{code}
It seems {{NettyServerStreamRpc}} is bound during construction, while
{{GrpcService}} is bound only when it is started. Would it make sense to defer
binding of the streaming server to its {{start()}} method?
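
To illustrate the timing issue, here is a minimal sketch using plain
{{java.net.ServerSocket}} rather than Ratis or Netty. {{SketchServer}},
{{eagerBindConflicts}} and {{deferredBindConflicts}} are hypothetical names
made up for this example, and the port collision in the first scenario is
forced on purpose (in the tests it only happens by chance). The second
scenario assumes the server with the fixed port is started before the one
asking for port 0, which may or may not match the actual start order in Ratis.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

final class BindTimingSketch {

  /** Hypothetical stand-in for a server that binds either in its constructor or in start(). */
  static final class SketchServer implements AutoCloseable {
    private final ServerSocket socket = new ServerSocket(); // created unbound
    private final int configuredPort;                       // 0 means "let the OS pick"

    SketchServer(int configuredPort, boolean bindInConstructor) throws IOException {
      this.configuredPort = configuredPort;
      if (bindInConstructor) {
        bind();
      }
    }

    void start() throws IOException {
      if (!socket.isBound()) {
        bind();
      }
    }

    private void bind() throws IOException {
      socket.bind(new InetSocketAddress(configuredPort));
    }

    int getLocalPort() {
      return socket.getLocalPort();
    }

    @Override
    public void close() throws IOException {
      socket.close();
    }
  }

  /** Streaming-like server binds port 0 at construction; a fixed-port server then fails in start(). */
  static boolean eagerBindConflicts() {
    try (SketchServer streaming = new SketchServer(0, true)) {
      // Forced coincidence: pretend the configured port is the one the OS just handed out.
      int taken = streaming.getLocalPort();
      try (SketchServer fixed = new SketchServer(taken, false)) {
        fixed.start();
        return false;
      } catch (BindException e) {
        return true; // address already in use
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Both servers defer binding; the fixed-port server starts first, so port 0 cannot land on it. */
  static boolean deferredBindConflicts() {
    try {
      int freePort;
      try (ServerSocket probe = new ServerSocket(0)) {
        freePort = probe.getLocalPort(); // released again when the probe closes
      }
      try (SketchServer streaming = new SketchServer(0, false);
           SketchServer fixed = new SketchServer(freePort, false)) {
        fixed.start();
        streaming.start();
        return streaming.getLocalPort() == freePort;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("eager bind conflicts: " + eagerBindConflicts());
    System.out.println("deferred bind conflicts: " + deferredBindConflicts());
  }
}
```

Deferring the bind does not remove the race with other processes on the same
host, but it would keep a server's own components from taking each other's
ports as long as the fixed ports are bound first.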
> Intermittent BindException in HA integration tests
> --------------------------------------------------
>
> Key: HDDS-7879
> URL: https://issues.apache.org/jira/browse/HDDS-7879
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: test
> Affects Versions: 1.4.0
> Reporter: Attila Doroszlai
> Priority: Major
>
> Some HA integration tests intermittently fail to start OM/SCM service due to
> port conflicts:
> {code:title=OM}
> org.apache.ratis.util.ExitUtils$ExitException: Failed to start Grpc server
> at org.apache.ratis.util.ExitUtils.terminate(ExitUtils.java:141)
> at org.apache.ratis.util.ExitUtils.terminate(ExitUtils.java:151)
> at org.apache.ratis.grpc.server.GrpcService.startImpl(GrpcService.java:260)
> at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
> at org.apache.ratis.server.RaftServerRpcWithProxy.start(RaftServerRpcWithProxy.java:72)
> at org.apache.ratis.server.impl.RaftServerProxy.startImpl(RaftServerProxy.java:394)
> at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
> at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:387)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.start(OzoneManagerRatisServer.java:557)
> at org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:1513)
> at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl$Builder.createOMService(MiniOzoneHAClusterImpl.java:525)
> at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl$Builder.build(MiniOzoneHAClusterImpl.java:426)
> {code}
> {code:title=SCM}
> org.apache.ratis.util.ExitUtils$ExitException: Failed to start Grpc server
> at org.apache.ratis.util.ExitUtils.terminate(ExitUtils.java:141)
> at org.apache.ratis.util.ExitUtils.terminate(ExitUtils.java:151)
> at org.apache.ratis.grpc.server.GrpcService.startImpl(GrpcService.java:260)
> at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
> at org.apache.ratis.server.RaftServerRpcWithProxy.start(RaftServerRpcWithProxy.java:72)
> at org.apache.ratis.server.impl.RaftServerProxy.startImpl(RaftServerProxy.java:394)
> at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:270)
> at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:387)
> at org.apache.hadoop.hdds.scm.ha.SCMRatisServerImpl.start(SCMRatisServerImpl.java:179)
> at org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl.start(SCMHAManagerImpl.java:102)
> at org.apache.hadoop.hdds.scm.server.StorageContainerManager.start(StorageContainerManager.java:1445)
> at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl$Builder.createSCMService(MiniOzoneHAClusterImpl.java:604)
> at org.apache.hadoop.ozone.MiniOzoneHAClusterImpl$Builder.build(MiniOzoneHAClusterImpl.java:425)
> {code}
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2022/12/16/19133/it-om/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestOMBucketLayoutUpgrade.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2023/01/05/19362/it-scm/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.TestStorageContainerManagerHA.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2023/01/10/19472/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.shell.TestOzoneShellHA.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2023/01/11/19484/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.TestMiniOzoneOMHACluster.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2023/01/12/19510/it-scm/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.TestFailoverWithSCMHA.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2023/02/01/19852/it-scm/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.TestStorageContainerManagerHA.txt
> * https://github.com/adoroszlai/ozone-build-results/blob/master/2023/02/02/19862/it-om/hadoop-ozone/integration-test/org.apache.hadoop.ozone.om.TestOMRatisSnapshots.txt