I am hitting this issue after upgrading to 0.6. During security group creation, the internal access rules within the cluster are not added, which blocks the datanodes from connecting to the namenode. After I added these rules manually, the cluster seems to work fine.
This bug recurs consistently in my environment.

Log:
########################################################
2011-10-04 12:23:20,949 DEBUG [jclouds.compute] (pool-3-thread-4) >> creating keyPair region(us-east-1) group(ccore27)
2011-10-04 12:23:20,949 DEBUG [jclouds.compute] (pool-3-thread-2) >> creating keyPair region(us-east-1) group(ccore27)
2011-10-04 12:23:21,414 DEBUG [jclouds.compute] (pool-3-thread-4) << created keyPair(jclouds#ccore27#us-east-1#72)
2011-10-04 12:23:21,414 DEBUG [jclouds.compute] (pool-3-thread-4) >> creating securityGroup region(us-east-1) name(jclouds#ccore27#us-east-1)
2011-10-04 12:23:21,692 DEBUG [jclouds.compute] (pool-3-thread-4) << created securityGroup(jclouds#ccore27#us-east-1)
2011-10-04 12:23:21,692 DEBUG [jclouds.compute] (pool-3-thread-4) >> authorizing securityGroup region(us-east-1) name(jclouds#ccore27#us-east-1) port(22)
2011-10-04 12:23:21,926 DEBUG [jclouds.compute] (pool-3-thread-4) << authorized securityGroup(jclouds#ccore27#us-east-1)
2011-10-04 12:23:21,926 DEBUG [jclouds.compute] (pool-3-thread-4) >> authorizing securityGroup region(us-east-1) name(jclouds#ccore27#us-east-1) permission to itself
2011-10-04 12:23:22,306 ERROR [org.apache.whirr.actions.BootstrapClusterAction] (pool-3-thread-3) Unexpected error while starting 4 nodes, minimum 4 nodes for [hadoop-datanode, hadoop-tasktracker] of cluster ccore27
java.util.concurrent.ExecutionException: java.lang.RuntimeException: request: POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1; cause: java.lang.NullPointerException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.whirr.actions.BootstrapClusterAction$StartupProcess.waitForOutcomes(BootstrapClusterAction.java:320)
    at org.apache.whirr.actions.BootstrapClusterAction$StartupProcess.call(BootstrapClusterAction.java:273)
    at org.apache.whirr.actions.BootstrapClusterAction$StartupProcess.call(BootstrapClusterAction.java:234)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.RuntimeException: request: POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1; cause: java.lang.NullPointerException
    at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:152)
    at org.jclouds.http.functions.ParseSax.parse(ParseSax.java:116)
    at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:78)
    at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:51)
    at com.google.common.util.concurrent.Futures$4.apply(Futures.java:439)
    at com.google.common.util.concurrent.Futures$4.apply(Futures.java:437)
    at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:713)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
    at org.jclouds.http.functions.ParseSax.addDetailsAndPropagate(ParseSax.java:152)
    at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:80)
    at org.jclouds.http.functions.ParseSax.apply(ParseSax.java:51)
    at com.google.common.util.concurrent.Futures$4.apply(Futures.java:439)
    at com.google.common.util.concurrent.Futures$4.apply(Futures.java:437)
    at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:713)
    ... 3 more
###########################################################

These are the rules that existed for the cluster created with Whirr 0.6:

GROUP 673040621396 jclouds#ccore27#us-east-1 jclouds#ccore27#us-east-1
PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0
PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS tcp 8020 8020 FROM CIDR 184.72.183.32/32
PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS tcp 8021 8021 FROM CIDR 184.72.183.32/32
PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS tcp 50030 50030 FROM CIDR 24.43.39.218/32
PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS tcp 50070 50070 FROM CIDR 24.43.39.218/32

I had to manually add

PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS all FROM USER 673040621396 GRPNAME jclouds#ccore27#us-east-1

to allow the datanodes to talk to the namenode.

Here is my config file, with property names updated to match 0.6:
##########################################
whirr.cluster-name=ccore27
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,4 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=*************************
whirr.credential=**********************
whirr.private-key-file=/Users/arun/.ec2/hadoopkey
whirr.public-key-file=/Users/arun/.ec2/hadoopkey.pub
whirr.client-cidrs=24.43.39.218/32
whirr.location-id=us-east-1
whirr.hardware-id=c1.xlarge
#c1.xlarge
# Ubuntu 10.04 LTS Lucid. See http://alestic.com/ or http://aws.amazon.com/amis/4348
# ebs root only
whirr.image-id=us-east-1/ami-4a0df923
whirr.hadoop.install-function=install_cdh_hadoop
whirr.hadoop.configure-function=configure_cdh_hadoop
#######################################
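For anyone who wants to check quickly whether their own cluster is affected, here is a minimal Python sketch that scans a security group listing for the group-to-itself rule. It assumes the listing has the same whitespace-separated layout as the rules pasted in this comment (PERMISSION, owner, group, then either a CIDR source or USER/GRPNAME). The function name `self_referencing_rules` and the sample constants are made up for illustration; none of this is part of Whirr or jclouds.

```python
def self_referencing_rules(listing):
    """Return the PERMISSION lines in which a group grants access to itself.

    Assumes lines shaped like the listing in this comment:
    PERMISSION <owner> <group> ALLOWS ... FROM USER <owner> GRPNAME <group>
    """
    matches = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] != "PERMISSION":
            continue
        owner, group = fields[1], fields[2]
        try:
            src_owner = fields[fields.index("USER") + 1]
            src_group = fields[fields.index("GRPNAME") + 1]
        except (ValueError, IndexError):
            continue  # CIDR-based rule, or a line that is cut short
        if (src_owner, src_group) == (owner, group):
            matches.append(line.strip())
    return matches


# Sample data copied from the rules quoted in this comment.
BROKEN = """\
GROUP 673040621396 jclouds#ccore27#us-east-1 jclouds#ccore27#us-east-1
PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0
PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS tcp 8020 8020 FROM CIDR 184.72.183.32/32
"""
MANUAL_RULE = (
    "PERMISSION 673040621396 jclouds#ccore27#us-east-1 ALLOWS all "
    "FROM USER 673040621396 GRPNAME jclouds#ccore27#us-east-1"
)
FIXED = BROKEN + MANUAL_RULE + "\n"

if __name__ == "__main__":
    print("self rule before manual fix:", bool(self_referencing_rules(BROKEN)))
    print("self rule after manual fix: ", bool(self_referencing_rules(FIXED)))
```

An empty result for a Whirr-created group would suggest the "permission to itself" authorization step failed, as it appears to have done in the log here.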
