[ https://issues.apache.org/jira/browse/HDDS-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826283#comment-16826283 ]
Siddharth Wagle commented on HDDS-999:
--------------------------------------
Hi [~elek], with the patch we wait for 50 seconds now before failing. Are you
ok with the current patch?
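
For readers following along, the retry-and-re-resolve behaviour discussed here could look roughly like the sketch below. It is not taken from HDDS-999.01.patch: the class name `DnsRetry`, the method `resolveWithRetry`, and the attempt count and sleep parameters are all hypothetical. The only idea it shows is that the address is looked up again on every attempt, and that the caller fails only after the retry budget is used up.

```java
import java.net.InetSocketAddress;
import java.util.function.Supplier;

/** Hypothetical helper, not part of the Ozone code base. */
final class DnsRetry {

  private DnsRetry() {
  }

  /**
   * Calls {@code lookup} until it returns a resolved address or until
   * {@code maxAttempts} lookups have been made. The supplier is expected
   * to build a fresh InetSocketAddress on each call so that the hostname
   * is resolved again instead of reusing the earlier unresolved result.
   *
   * @return the last address obtained; callers must still check
   *         {@code isUnresolved()} and fail if it is true.
   */
  static InetSocketAddress resolveWithRetry(
      Supplier<InetSocketAddress> lookup, int maxAttempts, long sleepMillis) {
    InetSocketAddress addr = lookup.get();
    for (int attempt = 1;
         addr.isUnresolved() && attempt < maxAttempts;
         attempt++) {
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and stop retrying.
        Thread.currentThread().interrupt();
        break;
      }
      addr = lookup.get();
    }
    return addr;
  }
}
```

A caller would pass something like `() -> new InetSocketAddress(host, port)`, since that constructor attempts resolution each time it runs, and would throw only if the returned address is still unresolved. The total wait is roughly `(maxAttempts - 1) * sleepMillis`; the values needed to reach the 50 seconds mentioned above depend on how the patch splits them and are not assumed here.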
> Make the DNS resolution in OzoneManager more resilient
> ------------------------------------------------------
>
> Key: HDDS-999
> URL: https://issues.apache.org/jira/browse/HDDS-999
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Elek, Marton
> Assignee: Siddharth Wagle
> Priority: Major
> Labels: pull-request-available
> Attachments: HDDS-999.01.patch
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> If the OzoneManager is started before SCM, the SCM hostname may not be
> resolvable in DNS yet. In this case the OM should retry and re-resolve the
> hostname, but as of now it throws an exception:
> {code:java}
> 2019-01-23 17:14:25 ERROR OzoneManager:593 - Failed to start the OzoneManager.
> java.net.SocketException: Call From om-0.om to null:0 failed on socket exception: java.net.SocketException: Unresolved address; For more details see: http://wiki.apache.org/hadoop/SocketException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:798)
> at org.apache.hadoop.ipc.Server.bind(Server.java:566)
> at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1042)
> at org.apache.hadoop.ipc.Server.<init>(Server.java:2815)
> at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:994)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:421)
> at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
> at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:804)
> at org.apache.hadoop.ozone.om.OzoneManager.startRpcServer(OzoneManager.java:563)
> at org.apache.hadoop.ozone.om.OzoneManager.getRpcServer(OzoneManager.java:927)
> at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:265)
> at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:674)
> at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:587)
> Caused by: java.net.SocketException: Unresolved address
> at sun.nio.ch.Net.translateToSocketException(Net.java:131)
> at sun.nio.ch.Net.translateException(Net.java:157)
> at sun.nio.ch.Net.translateException(Net.java:163)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
> at org.apache.hadoop.ipc.Server.bind(Server.java:549)
> ... 11 more
> Caused by: java.nio.channels.UnresolvedAddressException
> at sun.nio.ch.Net.checkAddress(Net.java:101)
> at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> ... 12 more{code}
> This should be fixed. (See also HDDS-421, which fixed the same problem on
> the datanode side, and HDDS-907, which is the workaround until this issue
> is resolved.)
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)