[ 
https://issues.apache.org/jira/browse/HDDS-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai updated HDDS-2863:
-----------------------------------
    Description: 
Recon exposes SCM-like RPC endpoints on a (possibly) different port than SCM. 
However, when the RPC server [updates the config with the actual address|https://github.com/apache/hadoop-ozone/blob/046a06f02783da516179ee8d8d1bed862d22f78d/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMDatanodeProtocolServer.java#L166-L168]
 after startup, it does so using the SCM-specific config key, even when the server belongs to Recon. 
As the output below suggests, Recon's address then replaces SCM's in the config, and a restarted SCM tries to bind to the port Recon already holds. 
Now that Recon is part of {{MiniOzoneCluster}}, this causes a {{BindException}} in {{TestSCMRestart}} 
(and possibly other integration tests):
{code:java|title=output}
2020-01-09 16:07:45,370 [main] INFO  server.StorageContainerManager (StorageContainerManager.java:start(775)) - ScmDatanodeProtocl RPC server is listening at /0.0.0.0:36225
...
2020-01-09 16:07:51,594 [main] INFO  scm.ReconStorageContainerManager (ReconStorageContainerManager.java:start(91)) - Recon ScmDatanodeProtocol RPC server is listening at /0.0.0.0:38907
{code}
{code:java|title=test failure}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 33.653 s <<< FAILURE! - in org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart
org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart  Time elapsed: 33.642 s  <<< ERROR!
java.net.BindException: Problem binding to [0.0.0.0:38907]
...
        at org.apache.hadoop.hdds.scm.server.StorageContainerManager.startRpcServer(StorageContainerManager.java:579)
        at org.apache.hadoop.hdds.scm.server.SCMDatanodeProtocolServer.<init>(SCMDatanodeProtocolServer.java:158)
        at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:327)
        at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:212)
        at org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:594)
        at org.apache.hadoop.ozone.MiniOzoneClusterImpl.restartStorageContainerManager(MiniOzoneClusterImpl.java:295)
        at org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart.init(TestSCMRestart.java:78)
{code}
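The mechanism described above can be illustrated with a minimal, self-contained sketch. It is not the project's code and not necessarily how the fix is implemented: the class name, the method names, and both config key strings are hypothetical, and the config is modelled as a plain {{Map}} rather than the real configuration class. The port numbers are the ones from the log output above. The sketch only shows why writing the actual address back under a fixed SCM key lets Recon overwrite SCM's address, and how letting each server choose its own key would avoid that.

```java
import java.util.HashMap;
import java.util.Map;

public class AddressKeySketch {
  // Hypothetical key names, used for illustration only.
  static final String SCM_KEY = "example.scm.datanode.address";
  static final String RECON_KEY = "example.recon.datanode.address";

  /**
   * Simulates a server that was bound to an ephemeral port and then
   * records its actual address in the shared config under the given key.
   */
  static void startServer(Map<String, String> conf, String keyToUpdate,
      int actualPort) {
    conf.put(keyToUpdate, "0.0.0.0:" + actualPort);
  }

  /** Problematic flow: Recon's server writes back under SCM's key. */
  static Map<String, String> buggyStartup() {
    Map<String, String> conf = new HashMap<>();
    startServer(conf, SCM_KEY, 36225); // SCM starts first
    startServer(conf, SCM_KEY, 38907); // Recon reuses the SCM-specific key
    return conf;
  }

  /** Assumed remedy: each server writes back under its own key. */
  static Map<String, String> fixedStartup() {
    Map<String, String> conf = new HashMap<>();
    startServer(conf, SCM_KEY, 36225);
    startServer(conf, RECON_KEY, 38907);
    return conf;
  }

  public static void main(String[] args) {
    // A restarted SCM would read this address and try to bind to it.
    System.out.println("buggy: " + buggyStartup().get(SCM_KEY)); // 0.0.0.0:38907
    System.out.println("fixed: " + fixedStartup().get(SCM_KEY)); // 0.0.0.0:36225
  }
}
```

In the problematic flow the SCM key ends up holding Recon's port, which matches the port in the {{BindException}} above.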

  was:
Recon exposes SCM-like RPC endpoints on (possibly) different port than SCM.  
However, when the RPC updates the config with the actual address after startup, 
it does so using the SCM-specific config key.  Now that Recon is part of 
{{MiniOzoneCluster}}, this causes BindException in {{TestSCMRestart}} (and 
possibly other integration tests):

{code:title=output}
2020-01-09 16:07:45,370 [main] INFO  server.StorageContainerManager 
(StorageContainerManager.java:start(775)) - ScmDatanodeProtocl RPC server is 
listening at /0.0.0.0:36225
...
2020-01-09 16:07:51,594 [main] INFO  scm.ReconStorageContainerManager 
(ReconStorageContainerManager.java:start(91)) - Recon ScmDatanodeProtocol RPC 
server is listening at /0.0.0.0:38907
{code}

{code:title=test failure}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 33.653 s <<< 
FAILURE! - in org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart
org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart  Time elapsed: 33.642 s  <<< 
ERROR!
java.net.BindException: Problem binding to [0.0.0.0:38907]
...
        at 
org.apache.hadoop.hdds.scm.server.StorageContainerManager.startRpcServer(StorageContainerManager.java:579)
        at 
org.apache.hadoop.hdds.scm.server.SCMDatanodeProtocolServer.<init>(SCMDatanodeProtocolServer.java:158)
        at 
org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:327)
        at 
org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:212)
        at 
org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:594)
        at 
org.apache.hadoop.ozone.MiniOzoneClusterImpl.restartStorageContainerManager(MiniOzoneClusterImpl.java:295)
        at 
org.apache.hadoop.hdds.scm.pipeline.TestSCMRestart.init(TestSCMRestart.java:78)
{code}


> BindException in TestSCMRestart
> -------------------------------
>
>                 Key: HDDS-2863
>                 URL: https://issues.apache.org/jira/browse/HDDS-2863
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Recon, SCM
>            Reporter: Attila Doroszlai
>            Assignee: Attila Doroszlai
>            Priority: Major


