[ 
https://issues.apache.org/jira/browse/HDDS-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823562#comment-16823562
 ] 

Siddharth Wagle commented on HDDS-999:
--------------------------------------

I can easily reproduce this on the latest trunk. I am working on a patch to 
make the HDDS-776 changes take effect. 

{code}
om_1        | 2019-04-22 22:30:15 ERROR OzoneManager:888 - Failed to start the OzoneManager.
om_1        | java.io.IOException: Invalid host name: local host is: (unknown); destination host is: "scm":9863; java.net.UnknownHostException; For more details see:  http://wiki.apache.org/hadoop/UnknownHost
om_1        |   at org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolClientSideTranslatorPB.transformServiceException(ScmBlockLocationProtocolClientSideTranslatorPB.java:173)
om_1        |   at org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolClientSideTranslatorPB.getScmInfo(ScmBlockLocationProtocolClientSideTranslatorPB.java:197)
om_1        |   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
om_1        |   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
om_1        |   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
om_1        |   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
om_1        |   at org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
om_1        |   at com.sun.proxy.$Proxy32.getScmInfo(Unknown Source)
om_1        |   at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:305)
om_1        |   at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:964)
om_1        |   at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:882)
om_1        | Caused by: java.net.UnknownHostException
om_1        |   at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:450)
om_1        |   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1552)
om_1        |   at org.apache.hadoop.ipc.Client.call(Client.java:1403)
om_1        |   at org.apache.hadoop.ipc.Client.call(Client.java:1367)
om_1        |   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
om_1        |   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
om_1        |   at com.sun.proxy.$Proxy31.getScmInfo(Unknown Source)
om_1        |   at org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolClientSideTranslatorPB.getScmInfo(ScmBlockLocationProtocolClientSideTranslatorPB.java:195)
om_1        |   ... 9 more
om_1        | 2019-04-22 22:30:15 INFO  ExitUtil:210 - Exiting with status 1: java.io.IOException: Invalid host name: local host is: (unknown); destination host is: "scm":9863; java.net.UnknownHostException; For more details see:  http://wiki.apache.org/hadoop/UnknownHost
om_1        | 2019-04-22 22:30:15 INFO  OzoneManager:51 - SHUTDOWN_MSG:
om_1        | /************************************************************
om_1        | SHUTDOWN_MSG: Shutting down OzoneManager at 989273176ea2/172.21.0.2
om_1        | ************************************************************/
{code}

> Make the DNS resolution in OzoneManager more resilient
> ------------------------------------------------------
>
>                 Key: HDDS-999
>                 URL: https://issues.apache.org/jira/browse/HDDS-999
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Manager
>            Reporter: Elek, Marton
>            Assignee: Siddharth Wagle
>            Priority: Major
>
> If the OzoneManager is started before SCM, the SCM DNS name may not be 
> resolvable yet. In this case the OM should retry and re-resolve the name, 
> but as of now it throws an exception:
> {code:java}
> 2019-01-23 17:14:25 ERROR OzoneManager:593 - Failed to start the OzoneManager.
> java.net.SocketException: Call From om-0.om to null:0 failed on socket exception: java.net.SocketException: Unresolved address; For more details see:  http://wiki.apache.org/hadoop/SocketException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:798)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:566)
>     at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1042)
>     at org.apache.hadoop.ipc.Server.<init>(Server.java:2815)
>     at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:994)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:421)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
>     at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:804)
>     at org.apache.hadoop.ozone.om.OzoneManager.startRpcServer(OzoneManager.java:563)
>     at org.apache.hadoop.ozone.om.OzoneManager.getRpcServer(OzoneManager.java:927)
>     at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:265)
>     at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:674)
>     at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:587)
> Caused by: java.net.SocketException: Unresolved address
>     at sun.nio.ch.Net.translateToSocketException(Net.java:131)
>     at sun.nio.ch.Net.translateException(Net.java:157)
>     at sun.nio.ch.Net.translateException(Net.java:163)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:549)
>     ... 11 more
> Caused by: java.nio.channels.UnresolvedAddressException
>     at sun.nio.ch.Net.checkAddress(Net.java:101)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     ... 12 more
> {code}
> It should be fixed. (See also HDDS-421, which fixed the same problem on the 
> datanode side, and HDDS-907, which is the workaround until this issue is 
> resolved.)
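
For illustration only, the "retry and re-resolve" behaviour requested in the 
description could be sketched roughly as below. This is not the HDDS-999 patch 
and does not use any Ozone or Hadoop API. The class name DnsRetrySketch, the 
method resolveWithRetry, the attempt count and the sleep interval are all 
hypothetical; only the host "scm" and port 9863 come from the log above. The 
point it relies on is that java.net.InetSocketAddress resolves the host name 
once, in its constructor, so every retry has to build a new instance.

```java
import java.net.InetSocketAddress;
import java.util.function.Supplier;

// Hypothetical sketch, not the actual OzoneManager code.
final class DnsRetrySketch {

  /**
   * Calls the lookup until it returns a resolved address or the attempts
   * run out. The supplier is expected to create a NEW InetSocketAddress on
   * each call, because an existing instance never re-resolves its host name.
   * Returns the last address obtained, which may still be unresolved.
   */
  static InetSocketAddress resolveWithRetry(
      Supplier<InetSocketAddress> lookup, int maxAttempts, long sleepMillis) {
    InetSocketAddress addr = lookup.get();
    for (int attempt = 1; addr.isUnresolved() && attempt < maxAttempts; attempt++) {
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt status and give up early.
        Thread.currentThread().interrupt();
        return addr;
      }
      addr = lookup.get(); // fresh lookup, hence a fresh DNS resolution
    }
    return addr;
  }

  public static void main(String[] args) {
    // 10 attempts and a 1000 ms pause are arbitrary values for the sketch.
    InetSocketAddress scm =
        resolveWithRetry(() -> new InetSocketAddress("scm", 9863), 10, 1000L);
    System.out.println(
        (scm.isUnresolved() ? "still unresolved: " : "resolved: ") + scm);
  }
}
```

Passing the lookup as a Supplier keeps the retry loop independent of how the 
address is obtained, so the caller decides whether a still-unresolved result 
is fatal.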


