Elek, Marton created HDDS-999:
---------------------------------
Summary: Make the DNS resolution in OzoneManager more resilient
Key: HDDS-999
URL: https://issues.apache.org/jira/browse/HDDS-999
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Manager
Reporter: Elek, Marton
If the OzoneManager is started before SCM, the SCM DNS name may not be
resolvable yet. In this case the OM should retry and re-resolve the address,
but as of now it throws an exception:
{code:java}
2019-01-23 17:14:25 ERROR OzoneManager:593 - Failed to start the OzoneManager.
java.net.SocketException: Call From om-0.om to null:0 failed on socket exception: java.net.SocketException: Unresolved address; For more details see: http://wiki.apache.org/hadoop/SocketException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:798)
at org.apache.hadoop.ipc.Server.bind(Server.java:566)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1042)
at org.apache.hadoop.ipc.Server.<init>(Server.java:2815)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:994)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:421)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:804)
at org.apache.hadoop.ozone.om.OzoneManager.startRpcServer(OzoneManager.java:563)
at org.apache.hadoop.ozone.om.OzoneManager.getRpcServer(OzoneManager.java:927)
at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:265)
at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:674)
at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:587)
Caused by: java.net.SocketException: Unresolved address
at sun.nio.ch.Net.translateToSocketException(Net.java:131)
at sun.nio.ch.Net.translateException(Net.java:157)
at sun.nio.ch.Net.translateException(Net.java:163)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
at org.apache.hadoop.ipc.Server.bind(Server.java:549)
... 11 more
Caused by: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
... 12 more
{code}
This should be fixed. (See also HDDS-421, which fixed the same problem on the
datanode side, and HDDS-907, which is the workaround until this issue is
resolved.)
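
The retry-and-re-resolve behaviour requested above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual change made for HDDS-999 or HDDS-421: the class name ResolveWithRetry, the method resolve, and its parameters are hypothetical, and the host name and port in the usage note are made up. The one fact it relies on is that a java.net.InetSocketAddress never becomes resolved after construction, so a fresh instance has to be created for every attempt.

```java
import java.net.InetSocketAddress;
import java.util.function.Supplier;

/** Hypothetical helper, not part of Hadoop or Ozone. */
public class ResolveWithRetry {

  /**
   * Calls the lookup until it returns a resolved address or maxAttempts
   * lookups have been made. The lookup must build a NEW InetSocketAddress
   * each time, because an existing unresolved instance stays unresolved.
   * The returned address may still be unresolved; the caller must check.
   */
  public static InetSocketAddress resolve(Supplier<InetSocketAddress> lookup,
      int maxAttempts, long sleepMillis) {
    InetSocketAddress addr = lookup.get();
    int attempt = 1;
    while (addr.isUnresolved() && attempt < maxAttempts) {
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt status and give up early.
        Thread.currentThread().interrupt();
        break;
      }
      addr = lookup.get();
      attempt++;
    }
    return addr;
  }
}
```

A caller might use it as `ResolveWithRetry.resolve(() -> new InetSocketAddress("scm-0.scm", 9860), 10, 2000L)` and fail with a clear message only if the result is still unresolved. Note that the JVM caches failed lookups for a short time (the networkaddress.cache.negative.ttl security property), so a sleep interval much shorter than that value is unlikely to help.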