[ 
https://issues.apache.org/jira/browse/HDDS-999?focusedWorklogId=231494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231494
 ]

ASF GitHub Bot logged work on HDDS-999:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Apr/19 16:27
            Start Date: 23/Apr/19 16:27
    Worklog Time Spent: 10m 
      Work Description: bharatviswa504 commented on pull request #758: 
HDDS-999. Make the DNS resolution in OzoneManager more resilient. (swagle)
URL: https://github.com/apache/hadoop/pull/758#discussion_r277764668
 
 

 ##########
 File path: hadoop-ozone/dist/src/main/compose/ozone-om-ha/docker-compose.yaml
 ##########
 @@ -36,7 +36,6 @@ services:
          - 9890:9872
       environment:
          ENSURE_OM_INITIALIZED: /data/metadata/om/current/VERSION
-         WAITFOR: scm:9876
 
 Review comment:
   We removed the WAITFOR env usage here, but a few other files still use it, 
for example om-statefulset.yaml. Do we need to remove it from there as well?
   
   Also, since we are removing the usage of WAITFOR, do we need to remove the 
logic that handles it in the docker image code?
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 231494)
    Time Spent: 0.5h  (was: 20m)

> Make the DNS resolution in OzoneManager more resilient
> ------------------------------------------------------
>
>                 Key: HDDS-999
>                 URL: https://issues.apache.org/jira/browse/HDDS-999
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Manager
>            Reporter: Elek, Marton
>            Assignee: Siddharth Wagle
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDDS-999.01.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If the OzoneManager is started before SCM, the SCM DNS name may not be 
> resolvable yet. In this case the OM should retry and re-resolve the DNS 
> name, but as of now it throws an exception:
> {code:java}
> 2019-01-23 17:14:25 ERROR OzoneManager:593 - Failed to start the OzoneManager.
> java.net.SocketException: Call From om-0.om to null:0 failed on socket exception: java.net.SocketException: Unresolved address; For more details see:  http://wiki.apache.org/hadoop/SocketException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:798)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:566)
>     at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1042)
>     at org.apache.hadoop.ipc.Server.<init>(Server.java:2815)
>     at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:994)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:421)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
>     at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:804)
>     at org.apache.hadoop.ozone.om.OzoneManager.startRpcServer(OzoneManager.java:563)
>     at org.apache.hadoop.ozone.om.OzoneManager.getRpcServer(OzoneManager.java:927)
>     at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:265)
>     at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:674)
>     at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:587)
> Caused by: java.net.SocketException: Unresolved address
>     at sun.nio.ch.Net.translateToSocketException(Net.java:131)
>     at sun.nio.ch.Net.translateException(Net.java:157)
>     at sun.nio.ch.Net.translateException(Net.java:163)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:549)
>     ... 11 more
> Caused by: java.nio.channels.UnresolvedAddressException
>     at sun.nio.ch.Net.checkAddress(Net.java:101)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     ... 12 more{code}
> It should be fixed. (See also HDDS-421, which fixed the same problem on the 
> datanode side, and HDDS-907, which is the workaround until this issue is 
> resolved.)
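
The "retry and re-resolve" behaviour asked for in the description can be sketched roughly as follows. This is only an illustration, not the code from the attached patch or the pull request: the class name ResolveWithRetry, the method resolveWithRetry and the attempt/sleep parameters are hypothetical. It relies on the fact that java.net.InetSocketAddress attempts name resolution once, in its constructor, so a fresh instance has to be created for every attempt.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.function.Supplier;

// Hypothetical helper, not part of Ozone: shows one way to retry DNS resolution.
public class ResolveWithRetry {

  /**
   * Calls the factory until it yields a resolved address or the attempts run out.
   * The factory must build a NEW InetSocketAddress on each call, because an
   * existing instance never re-resolves its host name.
   */
  static InetSocketAddress resolveWithRetry(Supplier<InetSocketAddress> factory,
      int maxAttempts, long sleepMillis)
      throws UnknownHostException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    InetSocketAddress last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      last = factory.get();
      if (!last.isUnresolved()) {
        return last; // the name resolved, safe to use for bind/connect
      }
      if (attempt < maxAttempts) {
        Thread.sleep(sleepMillis); // give DNS time to catch up before retrying
      }
    }
    throw new UnknownHostException("Could not resolve " + last.getHostString()
        + " after " + maxAttempts + " attempts");
  }

  /** Convenience overload: re-resolves host:port on every attempt. */
  static InetSocketAddress resolveWithRetry(String host, int port,
      int maxAttempts, long sleepMillis)
      throws UnknownHostException, InterruptedException {
    return resolveWithRetry(() -> new InetSocketAddress(host, port),
        maxAttempts, sleepMillis);
  }
}
```

A caller could, for example, use resolveWithRetry("scm", 9876, 10, 1000L) before building the RPC server, so that an UnknownHostException is raised only after the attempts are exhausted, instead of failing on the first unresolved lookup.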



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
