[ 
https://issues.apache.org/jira/browse/HADOOP-15129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321037#comment-16321037
 ] 

genericqa commented on HADOOP-15129:
------------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch 
generated 1 new + 174 unchanged - 0 fixed = 175 total (was 174) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 55s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
43s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HADOOP-15129 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905521/HADOOP-15129.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 47dbfee78bbd 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 12d0645 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/13949/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/13949/testReport/ |
| Max. process+thread count | 1467 (vs. ulimit of 5000) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/13949/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Datanode caches namenode DNS lookup failure and cannot startup
> --------------------------------------------------------------
>
>                 Key: HADOOP-15129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15129
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 2.8.2
>         Environment: Google Compute Engine.
> I'm using Java 8, Debian 8, Hadoop 2.8.2.
>            Reporter: Karthik Palaniappan
>            Assignee: Karthik Palaniappan
>            Priority: Minor
>         Attachments: HADOOP-15129.001.patch, HADOOP-15129.002.patch
>
>
> On startup, the Datanode creates an InetSocketAddress to register with each 
> namenode. Though there are retries on connection failure throughout the 
> stack, the same InetSocketAddress is reused.
> InetSocketAddress is an interesting class, because it resolves DNS names to 
> IP addresses on construction, and it is never refreshed. Hadoop re-creates an 
> InetSocketAddress in some cases just in case the remote IP has changed for a 
> particular DNS name: https://issues.apache.org/jira/browse/HADOOP-7472.
> Anyway, on startup, you cna see the Datanode log: "Namenode...remains 
> unresolved" -- referring to the fact that DNS lookup failed.
> {code:java}
> 2017-11-02 16:01:55,115 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Refresh request received for nameservices: null
> 2017-11-02 16:01:55,153 WARN org.apache.hadoop.hdfs.DFSUtilClient: Namenode 
> for null remains unresolved for ID null. Check your hdfs-site.xml file to 
> ensure namenodes are configured properly.
> 2017-11-02 16:01:55,156 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Starting BPOfferServices for nameservices: <default>
> 2017-11-02 16:01:55,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Block pool <registering> (Datanode Uuid unassigned) service to 
> cluster-32f5-m:8020 starting to offer service
> {code}
> The Datanode then proceeds to use this unresolved address, as it may work if 
> the DN is configured to use a proxy. Since I'm not using a proxy, it forever 
> prints out this message:
> {code:java}
> 2017-12-15 00:13:40,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:45,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:50,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:55,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:14:00,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> {code}
> Unfortunately, the log doesn't contain the exception that triggered it, but 
> the culprit is actually in IPC Client: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L444.
> This line was introduced in https://issues.apache.org/jira/browse/HADOOP-487 
> to give a clear error message when somebody mispells an address.
> However, the fix in HADOOP-7472 doesn't apply here, because that code happens 
> in Client#getConnection after the Connection is constructed.
> My proposed fix (will attach a patch) is to move this exception out of the 
> constructor and into a place that will trigger HADOOP-7472's logic to 
> re-resolve addresses. If the DNS failure was temporary, this will allow the 
> connection to succeed. If not, the connection will fail after ipc client 
> retries (default 10 seconds worth of retries).
> I want to fix this in ipc client rather than just in Datanode startup, as 
> this fixes temporary DNS issues for all of Hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to