[ https://issues.apache.org/jira/browse/HDDS-2047?focusedWorklogId=304363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304363 ]

ASF GitHub Bot logged work on HDDS-2047:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Aug/19 16:27
            Start Date: 30/Aug/19 16:27
    Worklog Time Spent: 10m 
      Work Description: xiaoyuyao commented on pull request #1379: HDDS-2047. 
Datanodes fail to come up after 10 retries in a secure env…
URL: https://github.com/apache/hadoop/pull/1379
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 304363)
    Time Spent: 50m  (was: 40m)

> Datanodes fail to come up after 10 retries in a secure environment
> ------------------------------------------------------------------
>
>                 Key: HDDS-2047
>                 URL: https://issues.apache.org/jira/browse/HDDS-2047
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode, Security
>    Affects Versions: 0.4.1
>            Reporter: Vivek Ratnavel Subramanian
>            Assignee: Xiaoyu Yao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code:java}
> 10:06:36.585 PM    ERROR    HddsDatanodeService    Error while storing SCM signed certificate.
> java.net.ConnectException: Call From jmccarthy-ozone-secure-2.vpc.cloudera.com/10.65.50.127 to jmccarthy-ozone-secure-1.vpc.cloudera.com:9961 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:755)
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1457)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1367)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>     at com.sun.proxy.$Proxy15.getDataNodeCertificate(Unknown Source)
>     at org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.getDataNodeCertificateChain(SCMSecurityProtocolClientSideTranslatorPB.java:156)
>     at org.apache.hadoop.ozone.HddsDatanodeService.getSCMSignedCert(HddsDatanodeService.java:278)
>     at org.apache.hadoop.ozone.HddsDatanodeService.initializeCertificateClient(HddsDatanodeService.java:248)
>     at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:211)
>     at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:168)
>     at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:143)
>     at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:70)
>     at picocli.CommandLine.execute(CommandLine.java:1173)
>     at picocli.CommandLine.access$800(CommandLine.java:141)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
>     at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
>     at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
>     at picocli.CommandLine.parseWithHandler(CommandLine.java:1465)
>     at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:65)
>     at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:56)
>     at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:126)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
>     at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1403)
>     ... 21 more
> {code}
> Datanodes try to get the SCM-signed certificate only 10 times, at 1-second 
> intervals. When SCM takes a little longer to come up, the datanodes throw an 
> exception and fail.
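
The failure mode described above (a fixed, small retry budget that runs out before SCM is listening) can be illustrated with a minimal sketch of a retry loop that has a configurable attempt count and a capped, doubling delay. This is not the code from HddsDatanodeService or from pull request #1379; the class name RetrySketch, the method retry, and all parameter names are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical illustration only; not part of Ozone or Hadoop.
class RetrySketch {

    /**
     * Calls op until it returns, retrying on IOException (which includes
     * java.net.ConnectException). The delay doubles after each failed
     * attempt and is capped at maxDelayMs. The last IOException is
     * rethrown once maxAttempts calls have failed.
     */
    static <T> T retry(int maxAttempts, long initialDelayMs, long maxDelayMs,
                       Callable<T> op) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Wait before the next attempt, then grow the delay.
                    Thread.sleep(delayMs);
                    delayMs = Math.min(delayMs * 2, maxDelayMs);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated server that refuses the first two connections.
        int[] calls = {0};
        String cert = retry(10, 10L, 100L, () -> {
            if (++calls[0] < 3) {
                throw new ConnectException("Connection refused");
            }
            return "signed-certificate";
        });
        System.out.println(cert + " obtained after " + calls[0] + " attempts");
    }
}
```

With the values from the report (10 attempts, 1-second interval, no growth) the total wait is bounded at roughly 9 seconds, which is why a slow SCM start is enough to exhaust it; raising the attempt count or letting the delay grow widens that window.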



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
