[ https://issues.apache.org/jira/browse/WHIRR-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435346#comment-13435346 ]
Tom White commented on WHIRR-612:
---------------------------------

The YARN test is failing on Rackspace with

{noformat}
2012-08-15 12:25:24,529 INFO ipc.Client (Client.java:handleConnectionFailure(683)) - Retrying connect to server: 67-207-153-65.static.cloud-ips.com/67.207.153.65:8040. Already tried 9 time(s).
java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
	at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getNewApplication(ClientRMProtocolPBClientImpl.java:134)
	at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:181)
	at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:214)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:338)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
	at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.net.SocketException: Malformed reply from SOCKS server; Host Details : local host is: "Clouderas-MacBook-Pro-3.local/192.168.0.188"; destination host is: "67-207-153-65.static.cloud-ips.com":8040;
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:187)
	at $Proxy10.getNewApplication(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getNewApplication(ClientRMProtocolPBClientImpl.java:132)
	... 23 more
Caused by: java.io.IOException: Failed on local exception: java.net.SocketException: Malformed reply from SOCKS server; Host Details : local host is: "Clouderas-MacBook-Pro-3.local/192.168.0.188"; destination host is: "67-207-153-65.static.cloud-ips.com":8040;
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
	at org.apache.hadoop.ipc.Client.call(Client.java:1165)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
	... 25 more
Caused by: java.net.SocketException: Malformed reply from SOCKS server
	at java.net.SocksSocketImpl.readSocksReply(SocksSocketImpl.java:147)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:538)
	at java.net.Socket.connect(Socket.java:529)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:522)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:472)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:566)
	at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:215)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1271)
	at org.apache.hadoop.ipc.Client.call(Client.java:1141)
	... 26 more
{noformat}

Port 8040 is the Resource Manager address, which the client talks to. The problem is that the client tries to talk to the public hostname (e.g. 67-207-153-65.static.cloud-ips.com), which is resolved on the head node (over the SSH SOCKS tunnel) to the public IP address 67.207.153.65. However, the localizer is only listening on the private address, so we get a connection refused.

In early versions of YARN the RM would listen on all interfaces, but MAPREDUCE-4163 changed the behaviour to listen on a single interface. I think this works on AWS because the resolution happens on the cluster, where the public hostname resolves to the private IP. Rackspace doesn't have this behaviour, so it fails.

Since this is a limitation in YARN (and it cannot be overridden as far as I can see), I think we should ship 0.8.0 with this known issue while we work out how to get YARN to work on Rackspace for a later release. (This will probably require fixes in both YARN and Whirr.)

We should commit this patch along with the whirr.env.MAPREDUCE_VERSION change. Does that sound reasonable?
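
For anyone who wants to see the bind difference in isolation, here is a minimal Java sketch. It is not YARN code: the class name BindSketch and its helper methods are made up for illustration, and the loopback address 127.0.0.1 merely stands in for the private address. It contrasts a listener bound to the wildcard address (reachable on every interface, roughly the old RM behaviour) with a listener bound to one specific address (reachable only on that address, roughly the behaviour after MAPREDUCE-4163).

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Illustration only: hypothetical class, not part of YARN or Whirr.
public class BindSketch {

    // Opens a listener on an ephemeral port.
    // A null host means "bind to the wildcard address" (all interfaces).
    static ServerSocket listen(String host) throws IOException {
        ServerSocket server = new ServerSocket();
        InetSocketAddress addr = (host == null)
                ? new InetSocketAddress(0)
                : new InetSocketAddress(InetAddress.getByName(host), 0);
        server.bind(addr);
        return server;
    }

    // True if the listener is bound to the wildcard address and so accepts
    // connections arriving on any local interface.
    static boolean acceptsOnAnyInterface(ServerSocket server) {
        return server.getInetAddress().isAnyLocalAddress();
    }

    public static void main(String[] args) throws IOException {
        // 127.0.0.1 is a stand-in for the node's private address.
        try (ServerSocket all = listen(null);
             ServerSocket single = listen("127.0.0.1")) {
            System.out.println("wildcard bind: " + acceptsOnAnyInterface(all));
            System.out.println("single-address bind: " + acceptsOnAnyInterface(single));
        }
    }
}
```

A client that connects to an address other than the one a single-address listener is bound to will not reach it, which is the situation when the public hostname resolves to the public IP.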

> CDH4 can be installed on Ubuntu now as well as CentOS
> -----------------------------------------------------
>
> Key: WHIRR-612
> URL: https://issues.apache.org/jira/browse/WHIRR-612
> Project: Whirr
> Issue Type: Bug
> Components: service/cdh
> Affects Versions: 0.8.0
> Reporter: Andrew Bayer
> Assignee: Andrew Bayer
> Fix For: 0.8.0
>
> Attachments: cdh-yarn-cloudservers-us.txt,
> cdh-yarn-rackspace-cloudservers-us.txt, WHIRR-612.patch, WHIRR-612.patch,
> WHIRR-612.patch
>
>
> CDH4 beta 1 was only available on CentOS, but from beta 2 onward, CDH4 has
> been available on Ubuntu et al. So we should remove the hardcoding in tests
> and recipes for centos.