[ https://issues.apache.org/jira/browse/WHIRR-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240701#comment-13240701 ]
Grant Ingersoll commented on WHIRR-459:
---------------------------------------

Also, from what I can tell, it's getting through the install part (creating the nodes and installing zk), but then failing in config.

> DNS Failure when trying to spawn HBase cluster
> ----------------------------------------------
>
> Key: WHIRR-459
> URL: https://issues.apache.org/jira/browse/WHIRR-459
> Project: Whirr
> Issue Type: Bug
> Affects Versions: 0.7.0
> Environment: Trying to use Whirr from behind a NAT
> Reporter: Akash Ashok
> Attachments: WHIRR-459.patch
>
>
> While trying to launch an HBase cluster from a system which runs behind a NAT
> I get the following exception. The cluster is spawned and then it gets
> destroyed. The same command runs fine when run from another EC2 instance.
> bin/whirr launch-cluster --config hbase-ec2.properties
> Bootstrapping cluster
> Configuring template
> Configuring template
> Starting 1 node(s) with roles [zookeeper, hadoop-namenode, hadoop-jobtracker, hbase-master]
> Starting 2 node(s) with roles [hadoop-datanode, hadoop-tasktracker, hbase-regionserver]
> Nodes started: [[id=us-east-1/i-5890203a, providerId=i-5890203a, group=hbase,
> name=hbase-5890203a, location=[id=us-east-1c, scope=ZONE,
> description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}],
> uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu,
> version=10.04, arch=paravirtual, is64Bit=true,
> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
> state=RUNNING, loginPort=22, hostname=domU-12-31-39-0F-94-D1,
> privateAddresses=[10.193.151.31], publicAddresses=[204.236.208.250],
> hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null,
> processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL,
> size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null,
> type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false],
> [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false,
> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd,
> durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0,
> device=/dev/sde, durable=false, isBootDevice=false]],
> supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()),
> tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5890203a}, tags=[]]]
> Nodes started: [[id=us-east-1/i-54902036, providerId=i-54902036, group=hbase,
> name=hbase-54902036, location=[id=us-east-1c, scope=ZONE,
> description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}],
> uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu,
> version=10.04, arch=paravirtual, is64Bit=true,
> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
> state=RUNNING, loginPort=22, hostname=ip-10-7-29-242,
> privateAddresses=[10.7.29.242], publicAddresses=[75.101.240.254],
> hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null,
> processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL,
> size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null,
> type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false],
> [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false,
> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd,
> durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0,
> device=/dev/sde, durable=false, isBootDevice=false]],
> supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()),
> tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-54902036}, tags=[]],
> [id=us-east-1/i-5a902038, providerId=i-5a902038, group=hbase,
> name=hbase-5a902038, location=[id=us-east-1c, scope=ZONE,
> description=us-east-1c, parent=us-east-1, iso3166Codes=[US-VA], metadata={}],
> uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu,
> version=10.04, arch=paravirtual, is64Bit=true,
> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
> state=RUNNING, loginPort=22, hostname=ip-10-108-182-53,
> privateAddresses=[10.108.182.53], publicAddresses=[50.16.48.211],
> hardware=[id=c1.xlarge, providerId=c1.xlarge, name=null,
> processors=[[cores=8.0, speed=2.5]], ram=7168, volumes=[[id=null, type=LOCAL,
> size=10.0, device=/dev/sda1, durable=false, isBootDevice=true], [id=null,
> type=LOCAL, size=420.0, device=/dev/sdb, durable=false, isBootDevice=false],
> [id=null, type=LOCAL, size=420.0, device=/dev/sdc, durable=false,
> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdd,
> durable=false, isBootDevice=false], [id=null, type=LOCAL, size=420.0,
> device=/dev/sde, durable=false, isBootDevice=false]],
> supportsImage=And(ALWAYS_TRUE,Or(isWindows(),requiresVirtualizationType(paravirtual)),ALWAYS_TRUE,is64Bit()),
> tags=[]], loginUser=ubuntu, userMetadata={Name=hbase-5a902038}, tags=[]]]
> Authorizing firewall ingress to [us-east-1/i-5890203a] on ports [2181] for [122.172.0.45/32]
> Unable to start the cluster. Terminating all nodes.
> org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more
> Unable to load cluster state, assuming it has no running nodes.
> java.io.FileNotFoundException: /home/akash/.whirr/hbase/instances (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:137)
>     at com.google.common.io.Files$1.getInput(Files.java:100)
>     at com.google.common.io.Files$1.getInput(Files.java:97)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
>     at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
>     at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
>     at com.google.common.io.Files.readLines(Files.java:580)
>     at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
>     at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
>     at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Starting to run scripts on cluster for phase destroyinstances:
> Starting to run scripts on cluster for phase destroyinstances:
> Finished running destroy phase scripts on all cluster instances
> Destroying hbase cluster
> Cluster hbase destroyed
> Exception in thread "main" java.lang.RuntimeException: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:125)
>     at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
>     at org.apache.whirr.cli.Main.run(Main.java:64)
>     at org.apache.whirr.cli.Main.main(Main.java:97)
> Caused by: org.apache.whirr.net.DnsException: java.net.ConnectException: Connection refused
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:83)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:40)
>     at org.apache.whirr.Cluster$Instance.getPublicHostName(Cluster.java:112)
>     at org.apache.whirr.Cluster$Instance.getPublicAddress(Cluster.java:94)
>     at org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler.doBeforeConfigure(HadoopNameNodeClusterActionHandler.java:58)
>     at org.apache.whirr.service.hadoop.HadoopClusterActionHandler.beforeConfigure(HadoopClusterActionHandler.java:86)
>     at org.apache.whirr.service.ClusterActionHandlerSupport.beforeAction(ClusterActionHandlerSupport.java:53)
>     at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
>     at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:109)
>     ... 3 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at org.xbill.DNS.TCPClient.connect(TCPClient.java:30)
>     at org.xbill.DNS.TCPClient.sendrecv(TCPClient.java:118)
>     at org.xbill.DNS.SimpleResolver.send(SimpleResolver.java:254)
>     at org.xbill.DNS.ExtendedResolver$Resolution.start(ExtendedResolver.java:95)
>     at org.xbill.DNS.ExtendedResolver.send(ExtendedResolver.java:358)
>     at org.apache.whirr.net.FastDnsResolver.apply(FastDnsResolver.java:69)
>     ... 11 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira