[
https://issues.apache.org/jira/browse/HDDS-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Attila Doroszlai updated HDDS-2863:
-----------------------------------
Description:
Recon exposes SCM-like RPC endpoints on a (possibly) different port than SCM.
However, when the RPC server [updates the config with the actual
address|https://github.com/apache/hadoop-ozone/blob/046a06f02783da516179ee8d8d1bed862d22f78d/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMDatanodeProtocolServer.java#L166-L168]
after startup, it does so using the SCM-specific config key, so Recon's address
overwrites SCM's. Now that Recon is part of {{MiniOzoneCluster}}, a restarted
SCM tries to bind to the port Recon already holds, which causes
{{BindException}} in {{TestSCMRestart}} (and possibly other integration tests):
{code:java|title=output}
2020-01-09 16:07:45,370 [main] INFO server.StorageContainerManager (StorageContainerManager.java:start(775)) - ScmDatanodeProtocl RPC server is listening at /0.0.0.0:36225
...
2020-01-09 16:07:51,594 [main] INFO scm.ReconStorageContainerManager (ReconStorageContainerManager.java:start(91)) - Recon ScmDatanodeProtocol RPC server is listening at /0.0.0.0:38907
{code}
{code:java|title=test failure}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 33.653 s <<< FAILURE! - in org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart
org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart Time elapsed: 33.642 s <<< ERROR!
java.net.BindException: Problem binding to [0.0.0.0:38907]
...
at org.apache.hadoop.hdds.scm.server.StorageContainerManager.startRpcServer(StorageContainerManager.java:579)
at org.apache.hadoop.hdds.scm.server.SCMDatanodeProtocolServer.<init>(SCMDatanodeProtocolServer.java:158)
at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:327)
at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:212)
at org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:594)
at org.apache.hadoop.ozone.MiniOzoneClusterImpl.restartStorageContainerManager(MiniOzoneClusterImpl.java:295)
at org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart.init(TestSCMRestart.java:78)
{code}
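The failure mode can be reproduced outside Ozone with plain sockets. The following is only a minimal sketch, not the actual SCM or Recon code: {{SharedKeyBindDemo}}, {{start}}, {{restartScmFails}} and both config key names are hypothetical, and a {{Map}} stands in for the shared configuration. Each server binds to the port found under its read key (0 means "pick any free port") and then records the actual port under its write key. When the second server writes under the first server's key, a restart of the first server should hit {{BindException}}; with separate keys the restart is expected to succeed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.util.HashMap;
import java.util.Map;

public class SharedKeyBindDemo {
  // Hypothetical key names for illustration; not the real Ozone config keys.
  static final String SCM_KEY = "demo.scm.datanode.port";
  static final String RECON_KEY = "demo.recon.datanode.port";

  /**
   * Binds to the port stored under readKey (0 = ephemeral port), then
   * records the port actually bound under writeKey.
   */
  static ServerSocket start(Map<String, Integer> conf, String readKey,
      String writeKey) throws IOException {
    ServerSocket socket = new ServerSocket();
    try {
      socket.bind(new InetSocketAddress("127.0.0.1",
          conf.getOrDefault(readKey, 0)));
    } catch (IOException e) {
      socket.close();
      throw e;
    }
    conf.put(writeKey, socket.getLocalPort());
    return socket;
  }

  /** Returns true if restarting the SCM-like server fails to bind. */
  public static boolean restartScmFails(boolean reconUsesOwnKey) {
    Map<String, Integer> conf = new HashMap<>();
    try {
      ServerSocket scm = start(conf, SCM_KEY, SCM_KEY);
      // The Recon-like server reads its own key, but may write the SCM key.
      String reconWriteKey = reconUsesOwnKey ? RECON_KEY : SCM_KEY;
      try (ServerSocket recon = start(conf, RECON_KEY, reconWriteKey)) {
        scm.close(); // simulate stopping SCM before the restart
        try (ServerSocket restarted = start(conf, SCM_KEY, SCM_KEY)) {
          return false; // bind succeeded
        } catch (BindException e) {
          return true; // port is still held by the Recon-like server
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("shared key, restart fails: " + restartScmFails(false));
    System.out.println("own key, restart fails: " + restartScmFails(true));
  }
}
```

The sketch suggests one possible direction for a fix, namely letting the Recon server record its address under a Recon-specific key; whether the actual patch takes that route is not stated here.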
was:
Recon exposes SCM-like RPC endpoints on (possibly) different port than SCM.
However, when the RPC updates the config with the actual address after startup,
it does so using the SCM-specific config key. Now that Recon is part of
{{MiniOzoneCluster}}, this causes BindException in {{TestSCMRestart}} (and
possibly other integration tests):
{code:title=output}
2020-01-09 16:07:45,370 [main] INFO server.StorageContainerManager
(StorageContainerManager.java:start(775)) - ScmDatanodeProtocl RPC server is
listening at /0.0.0.0:36225
...
2020-01-09 16:07:51,594 [main] INFO scm.ReconStorageContainerManager
(ReconStorageContainerManager.java:start(91)) - Recon ScmDatanodeProtocol RPC
server is listening at /0.0.0.0:38907
{code}
{code:title=test failure}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 33.653 s <<<
FAILURE! - in org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart
org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart Time elapsed: 33.642 s <<<
ERROR!
java.net.BindException: Problem binding to [0.0.0.0:38907]
...
at
org.apache.hadoop.hdds.scm.server.StorageContainerManager.startRpcServer(StorageContainerManager.java:579)
at
org.apache.hadoop.hdds.scm.server.SCMDatanodeProtocolServer.<init>(SCMDatanodeProtocolServer.java:158)
at
org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:327)
at
org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:212)
at
org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:594)
at
org.apache.hadoop.ozone.MiniOzoneClusterImpl.restartStorageContainerManager(MiniOzoneClusterImpl.java:295)
at
org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart.init(TestSCMRestart.java:78)
{code}
> BindException in TestSCMRestart
> -------------------------------
>
> Key: HDDS-2863
> URL: https://issues.apache.org/jira/browse/HDDS-2863
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Recon, SCM
> Reporter: Attila Doroszlai
> Assignee: Attila Doroszlai
> Priority: Major