[ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609700#comment-16609700
 ] 

Hudson commented on HDDS-421:
-----------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14915 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14915/])
HDDS-421. Resilient DNS resolution in datanode-service. Contributed by (elek: 
rev 317f317d4b9f8db4b55039227c7e13baac337544)
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/datanode/InitDatanodeState.java


> Resilient DNS resolution in datanode-service
> --------------------------------------------
>
>                 Key: HDDS-421
>                 URL: https://issues.apache.org/jira/browse/HDDS-421
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Major
>             Fix For: 0.2.1, 0.3.0
>
>         Attachments: HDDS-421-ozone-0.2.1.001.patch
>
>
> When I start large clusters on Kubernetes, I often hit the following error:
> if the DNS name of the SCM is not yet resolvable while the datanode boots, 
> the datanode never connects to the SCM. It retries the connection, but the 
> DNS resolution is not repeated.
> The problem is in InitDatanodeState.call(). It calls getSCMAddresses, 
> which creates the InetSocketAddress instances using the Hadoop utilities. 
> While creating an InetSocketAddress, the Hadoop utilities try to 
> resolve the address and store the result in the InetSocketAddress.
> The address may be unresolved, but InitDatanodeState.call() still starts 
> using it (connectionManager.addSCMServer), and no later attempt is made to 
> resolve it.
> My small proposal is to return immediately if any of the SCM addresses is 
> unresolved; the main loop of the DatanodeStateMachine will then try again 
> (including the DNS resolution).
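
The proposal in the description can be illustrated with a minimal, standalone sketch. This is not the committed patch: the class ResolveCheck and the methods anyUnresolved and tryInit are hypothetical names, and the port is arbitrary. It relies only on the standard java.net.InetSocketAddress API, where the (host, port) constructor attempts resolution and isUnresolved() reports whether that attempt failed.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration only; not the actual InitDatanodeState code.
final class ResolveCheck {

  /** Returns true if at least one address was not resolved to an IP. */
  static boolean anyUnresolved(Collection<InetSocketAddress> addresses) {
    for (InetSocketAddress addr : addresses) {
      if (addr.isUnresolved()) {
        return true;
      }
    }
    return false;
  }

  /**
   * Stand-in for one pass of the init step. The addresses are rebuilt on
   * every call, so DNS resolution is attempted again each time. Returns
   * false (do nothing, let the caller retry later) if any host is
   * still unresolved.
   */
  static boolean tryInit(List<String> hosts, int port) {
    List<InetSocketAddress> addresses = new ArrayList<>();
    for (String host : hosts) {
      // This constructor tries to resolve the host name immediately.
      addresses.add(new InetSocketAddress(host, port));
    }
    if (anyUnresolved(addresses)) {
      return false;
    }
    // A real implementation would register the addresses here.
    return true;
  }

  public static void main(String[] args) {
    // createUnresolved never performs a lookup, so it models a DNS name
    // that is not available yet without touching the network.
    List<InetSocketAddress> addresses = Arrays.asList(
        new InetSocketAddress("127.0.0.1", 9861),
        InetSocketAddress.createUnresolved("scm-0.invalid", 9861));
    System.out.println("any unresolved: " + anyUnresolved(addresses));
  }
}
```

The important design point is that the addresses are recreated on each attempt rather than cached, since an InetSocketAddress that was unresolved at construction time stays unresolved.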



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
