[ https://issues.apache.org/jira/browse/HADOOP-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647621#action_12647621 ]
Steve Loughran commented on HADOOP-4659:
----------------------------------------

Full stack trace:

Termination Record: HOST morzine.hpl.hp.com:rootProcess:testOrphanTracker:action:taskTracker, type: abnormal, description: Service has halted (this termination was not expected)
java.io.IOException: Call to localhost/127.0.0.1:8012 failed on local exception: java.net.ConnectException: Connection refused
	at org.apache.hadoop.ipc.Client.call(Client.java:699)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
	at org.apache.hadoop.mapred.$Proxy7.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
	at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:453)
	at org.apache.hadoop.mapred.TaskTracker.innerStart(TaskTracker.java:831)
	at org.apache.hadoop.util.Service.start(Service.java:186)
	at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl.innerDeploy(HadoopServiceImpl.java:480)
	at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl.access$000(HadoopServiceImpl.java:47)
	at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl$ServiceDeployerThread.execute(HadoopServiceImpl.java:630)
	at org.smartfrog.sfcore.utils.SmartFrogThread.run(SmartFrogThread.java:279)
	at org.smartfrog.sfcore.utils.WorkflowThread.run(WorkflowThread.java:117)
//and the nested exception
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
	at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:771)
	at org.apache.hadoop.ipc.Client.call(Client.java:685)

> Root cause of connection failure is being lost to code that uses it for delaying startup
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4659
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4659
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> In ipc.Client, the root cause of a connection failure is being lost as the
> exception is wrapped, so the outside code that looks for that root cause
> isn't working as expected. The result is that you can't bring up a task
> tracker before the job tracker, and probably the same for a datanode before
> a namenode. The change that triggered this has not yet been located; I had
> thought it was HADOOP-3844, but I no longer believe this is the case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
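
For illustration only, here is a minimal sketch of the failure mode described in the issue. It is not Hadoop's code: the class `RootCauseDemo` and the helpers `wrap` and `findCause` are hypothetical names, and the message format merely imitates the one in the trace above. It assumes the caller decides whether to wait and retry by checking for `java.net.ConnectException`. Once that exception is wrapped in a plain `IOException`, a check on the outer type no longer matches, while walking the cause chain still finds it.

```java
import java.io.IOException;
import java.net.ConnectException;

public class RootCauseDemo {

    /**
     * Hypothetical helper: wraps a low-level failure in a plain IOException,
     * keeping the original exception as the cause.
     */
    static IOException wrap(String address, IOException e) {
        return new IOException(
                "Call to " + address + " failed on local exception: " + e, e);
    }

    /**
     * Hypothetical helper: walks the cause chain and returns the first
     * throwable of the requested type, or null if there is none.
     */
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        int depth = 0;
        // The depth limit guards against a cyclic cause chain.
        while (t != null && depth++ < 32) {
            if (type.isInstance(t)) {
                return type.cast(t);
            }
            t = t.getCause();
        }
        return null;
    }

    public static void main(String[] args) {
        IOException refused = new ConnectException("Connection refused");
        IOException wrapped = wrap("localhost/127.0.0.1:8012", refused);

        // A check on the outer type no longer sees the connection failure.
        System.out.println(wrapped instanceof ConnectException);

        // Walking the cause chain still recovers it.
        System.out.println(findCause(wrapped, ConnectException.class) != null);
    }
}
```

In this sketch the first check is false and the second is true, which is the gap between what the wrapped exception carries and what a type-based `catch (ConnectException e)` in calling code would observe.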