[ https://issues.apache.org/jira/browse/HADOOP-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-4659:
-----------------------------------

    Attachment: hadoop-4659.patch

Here's a merged patch that retains the same exception types as before (so the calling code does not need to look inside nested exceptions), and which contains the test. The new TestRPC test correctly detects a failure to connect.

The remaining problem is that on my machine (64-bit JRockit JVM on Ubuntu), TestFileCreationClient hangs, and the hang appears to be in these methods. Accordingly, I'm not setting the patch available flag, as it may cause trouble for Hudson.

> Root cause of connection failure is being lost to code that uses it for delaying startup
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4659
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4659
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 0.18.3
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Blocker
>             Fix For: 0.18.3
>
>         Attachments: connectRetry.patch, hadoop-4659.patch, hadoop-4659.patch, rpcConn.patch
>
>
> In ipc.Client the root cause of a connection failure is being lost because the
> exception is wrapped, so the outside code that looks for that root cause
> isn't working as expected. The result is that you can't bring up a task tracker
> before the job tracker, and probably the same holds for a datanode before a namenode.
> The change that triggered this has not yet been located; I had thought it was
> HADOOP-3844, but I no longer believe this is the case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
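
For readers unfamiliar with the failure mode described in this issue, here is a minimal Java sketch of wrapping an exception with extra context while keeping its original type, so that calling code can still catch ConnectException without unwrapping nested causes. This is an illustration only, not the code in hadoop-4659.patch: the class name WrapSketch, the helper wrapException, the message wording, and the address "jobtracker:9001" are all assumptions made for the example.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

class WrapSketch {
    // Hypothetical helper: add the remote address to the message but return
    // an exception of the same type, with the original attached as its cause.
    static IOException wrapException(String addr, IOException e) {
        if (e instanceof ConnectException) {
            return (ConnectException) new ConnectException(
                "Call to " + addr + " failed on connection exception: " + e)
                .initCause(e);
        } else if (e instanceof SocketTimeoutException) {
            return (SocketTimeoutException) new SocketTimeoutException(
                "Call to " + addr + " failed on socket timeout exception: " + e)
                .initCause(e);
        } else {
            // Any other I/O failure is reported as a plain IOException.
            return (IOException) new IOException(
                "Call to " + addr + " failed on local exception: " + e)
                .initCause(e);
        }
    }

    public static void main(String[] args) {
        IOException wrapped = wrapException(
            "jobtracker:9001", new ConnectException("Connection refused"));
        // A caller that delays startup can test the type directly.
        if (wrapped instanceof ConnectException) {
            System.out.println("server not up yet, would retry: "
                + wrapped.getMessage());
        }
    }
}
```

With this shape, a retry loop in a task tracker or datanode can keep catching ConnectException as before, while the original exception remains reachable through getCause() for logging.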