[jira] [Commented] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609700#comment-16609700
 ] 

Hudson commented on HDDS-421:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14915 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14915/])
HDDS-421. Resilient DNS resolution in datanode-service. Contributed by (elek: 
rev 317f317d4b9f8db4b55039227c7e13baac337544)
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/datanode/InitDatanodeState.java


> Resilient DNS resolution in datanode-service
> 
>
> Key: HDDS-421
> URL: https://issues.apache.org/jira/browse/HDDS-421
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Fix For: 0.2.1, 0.3.0
>
> Attachments: HDDS-421-ozone-0.2.1.001.patch
>
>
> When I start big clusters on Kubernetes, I regularly hit a typical error:
> if the DNS entry of the SCM is not yet available while the datanode boots,
> the datanode won't connect to the SCM. It tries to reconnect, but the DNS
> resolution is not repeated.
> The problem is in InitDatanodeState.call(). It calls getSCMAddresses,
> which creates the InetSocketAddress instances using the Hadoop utilities.
> During the creation of an InetSocketAddress, the Hadoop utilities try to
> resolve the address and save the result in the InetSocketAddress.
> The address may be unresolved, but InitDatanodeState.call() will start to
> use it anyway (connectionManager.addSCMServer), and there won't be any
> later attempt to resolve it.
> My small proposal is to return immediately if any of the SCM addresses is
> unresolved, so that the main loop of the DatanodeStateMachine tries again
> (together with the DNS resolution part).
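The proposal quoted above can be illustrated with a minimal, self-contained sketch. It is not the committed patch: the class name, the helper methods, the host name and port 9861 are all hypothetical, and the retry loop only stands in for the DatanodeStateMachine main loop. It relies solely on the standard java.net.InetSocketAddress behaviour that resolution happens once, at construction, and that isUnresolved() reports whether it succeeded.

```java
import java.net.InetSocketAddress;
import java.util.Collection;
import java.util.Collections;
import java.util.function.Supplier;

public class ScmAddressRetrySketch {

  /** Returns true if at least one address has no resolved IP yet. */
  static boolean anyUnresolved(Collection<InetSocketAddress> addresses) {
    for (InetSocketAddress addr : addresses) {
      if (addr.isUnresolved()) {
        return true;
      }
    }
    return false;
  }

  /**
   * Hypothetical stand-in for one pass of the init state: build the
   * addresses afresh (which repeats the resolution) and bail out if any
   * of them is unresolved. Returns null to signal "try again later".
   */
  static Collection<InetSocketAddress> initOnce(
      Supplier<Collection<InetSocketAddress>> lookup) {
    Collection<InetSocketAddress> addresses = lookup.get();
    return anyUnresolved(addresses) ? null : addresses;
  }

  public static void main(String[] args) {
    // Simulated lookup: the first call behaves as if DNS were not ready,
    // later calls succeed. An IP literal avoids a real DNS query.
    final int[] attempt = {0};
    Supplier<Collection<InetSocketAddress>> lookup = () -> {
      attempt[0]++;
      InetSocketAddress scm = attempt[0] < 2
          ? InetSocketAddress.createUnresolved("scm.example.invalid", 9861)
          : new InetSocketAddress("127.0.0.1", 9861);
      return Collections.singletonList(scm);
    };

    // Stand-in for the state machine's main loop: keep retrying the init
    // pass until every address is resolved.
    Collection<InetSocketAddress> ready = null;
    while (ready == null) {
      ready = initOnce(lookup);
    }
    System.out.println("All SCM addresses resolved after "
        + attempt[0] + " attempts");
  }
}
```

The point of the sketch is that an unresolved InetSocketAddress never resolves itself later, so the address has to be constructed again on every retry rather than cached from the first attempt.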



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-10 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609652#comment-16609652
 ] 

Elek, Marton commented on HDDS-421:
---

Thanks [~anu] for the review. I will take care of the committing part and
commit it soon.




[jira] [Commented] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-10 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609370#comment-16609370
 ] 

Anu Engineer commented on HDDS-421:
---

+1, patch looks good to me.




[jira] [Commented] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608581#comment-16608581
 ] 

Hadoop QA commented on HDDS-421:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} ozone-0.2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 29s{color} | {color:green} ozone-0.2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} ozone-0.2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} ozone-0.2.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} ozone-0.2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s{color} | {color:green} ozone-0.2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} ozone-0.2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 39s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s{color} | {color:green} container-service in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 47s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-421 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939004/HDDS-421-ozone-0.2.1.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d442c8bb24d7 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | ozone-0.2.1 / be1ec00 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDDS-Build/1015/testReport/ |
| Max. process+thread count | 407 (vs. ulimit of 1) |
| modules | C: hadoop-hdds/container-service U: hadoop-hdds/container-service |
| Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/1015/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDDS-421) Resilient DNS resolution in datanode-service

2018-09-09 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608575#comment-16608575
 ] 

Elek, Marton commented on HDDS-421:
---

Tested with kubernetes. All of the datanodes could be started with this patch.
