[ https://issues.apache.org/jira/browse/WHIRR-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435346#comment-13435346 ]

Tom White commented on WHIRR-612:
---------------------------------

The YARN test is failing on Rackspace with:

{noformat}
2012-08-15 12:25:24,529 INFO  ipc.Client (Client.java:handleConnectionFailure(683)) - Retrying connect to server: 67-207-153-65.static.cloud-ips.com/67.207.153.65:8040. Already tried 9 time(s).
java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
        at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getNewApplication(ClientRMProtocolPBClientImpl.java:134)
        at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:181)
        at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:214)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:338)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.net.SocketException: Malformed reply from SOCKS server; Host Details : local host is: "Clouderas-MacBook-Pro-3.local/192.168.0.188"; destination host is: "67-207-153-65.static.cloud-ips.com":8040; 
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:187)
        at $Proxy10.getNewApplication(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getNewApplication(ClientRMProtocolPBClientImpl.java:132)
        ... 23 more
Caused by: java.io.IOException: Failed on local exception: java.net.SocketException: Malformed reply from SOCKS server; Host Details : local host is: "Clouderas-MacBook-Pro-3.local/192.168.0.188"; destination host is: "67-207-153-65.static.cloud-ips.com":8040; 
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765)
        at org.apache.hadoop.ipc.Client.call(Client.java:1165)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
        ... 25 more
Caused by: java.net.SocketException: Malformed reply from SOCKS server
        at java.net.SocksSocketImpl.readSocksReply(SocksSocketImpl.java:147)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:538)
        at java.net.Socket.connect(Socket.java:529)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:522)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:472)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:566)
        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:215)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1271)
        at org.apache.hadoop.ipc.Client.call(Client.java:1141)
        ... 26 more
{noformat}

Port 8040 is the Resource Manager address, which the client talks to. The 
problem is that the client tries to talk to the public hostname (e.g. 
67-207-153-65.static.cloud-ips.com), which is resolved on the head node (over 
the SSH SOCKS tunnel) to the public IP address 67.207.153.65. However, the 
localizer is only listening on the private address, so we get a connection 
refused.

In early versions of YARN the RM would listen on all interfaces; however, 
MAPREDUCE-4163 changed the behaviour so that it listens on a single interface.

I think this works on AWS because the resolution happens on the cluster, where 
the public hostname resolves to the private IP. Rackspace doesn't have this 
behaviour, so it fails.
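
One way to check the resolution behaviour described above is to run a small 
program on a cluster node and see whether the public hostname comes back as a 
private address. This is only a sketch: the class name RmAddressCheck is 
hypothetical, 10.0.0.1 is a made-up private address used for illustration, and 
"private" here just means what java.net.InetAddress reports as site-local 
(10/8, 172.16/12, 192.168/16). It is not part of Whirr or YARN.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper, not part of Whirr or YARN.
final class RmAddressCheck {

    // Resolves a hostname or IP literal as seen from the machine running this code.
    static InetAddress resolve(String host) {
        try {
            return InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + host, e);
        }
    }

    // True if the host resolves to a site-local (private) address.
    static boolean isPrivate(String host) {
        return resolve(host).isSiteLocalAddress();
    }

    // One-line summary: what the name resolved to and whether it is private.
    static String describe(String host) {
        InetAddress addr = resolve(host);
        String kind = addr.isSiteLocalAddress() ? "private" : "not private";
        return host + " -> " + addr.getHostAddress() + " (" + kind + ")";
    }

    public static void main(String[] args) {
        // With no arguments, fall back to two IP literals (no DNS lookup needed):
        // the public IP from the log above and an assumed private address.
        String[] hosts = args.length > 0
                ? args
                : new String[] {"67.207.153.65", "10.0.0.1"};
        for (String host : hosts) {
            System.out.println(describe(host));
        }
    }
}
```

Passing the public hostname as an argument on a cluster node should show 
whether that provider maps it to the private IP (as I think AWS does) or to 
the public one (as on Rackspace).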

Since this is a limitation in YARN (and it cannot be overridden as far as I can 
see), I think we should ship 0.8.0 with this known issue while we work out how 
to get YARN to work on Rackspace for a later release. (This will probably 
require fixes in YARN and Whirr.) We should commit this patch along with the 
whirr.env.MAPREDUCE_VERSION thing. Does that sound reasonable?


                
> CDH4 can be installed on Ubuntu now as well as CentOS
> -----------------------------------------------------
>
>                 Key: WHIRR-612
>                 URL: https://issues.apache.org/jira/browse/WHIRR-612
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/cdh
>    Affects Versions: 0.8.0
>            Reporter: Andrew Bayer
>            Assignee: Andrew Bayer
>             Fix For: 0.8.0
>
>         Attachments: cdh-yarn-cloudservers-us.txt, 
> cdh-yarn-rackspace-cloudservers-us.txt, WHIRR-612.patch, WHIRR-612.patch, 
> WHIRR-612.patch
>
>
> CDH4 beta 1 was only available on CentOS, but from beta 2 onward, CDH4 has 
> been available on Ubuntu et al. So we should remove the hardcoding in tests 
> and recipes for CentOS.
