Posting this so others can see what's going on: I think I found it. Here's the relevant output from /tmp/setup-ec2-user.sh, as captured in /tmp/stderr.log:
+ apt-get -y install sun-java6-jdk
Failed to fetch http://archive.canonical.com/ubuntu/pool/partner/s/sun-java6/sun-java6-jre_6.26-2lucid1_all.deb  404  Not Found [IP: 91.189.88.33 80]
Failed to fetch http://archive.canonical.com/ubuntu/pool/partner/s/sun-java6/sun-java6-bin_6.26-2lucid1_amd64.deb  404  Not Found [IP: 91.189.88.33 80]
Failed to fetch http://archive.canonical.com/ubuntu/pool/partner/s/sun-java6/sun-java6-jdk_6.26-2lucid1_amd64.deb  404  Not Found [IP: 91.189.88.33 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
+ echo 'export JAVA_HOME=/usr/lib/jvm/java-6-sun'
+ echo 'export JAVA_HOME=/usr/lib/jvm/java-6-sun'
+ export JAVA_HOME=/usr/lib/jvm/java-6-sun
+ JAVA_HOME=/usr/lib/jvm/java-6-sun
+ java -version
/tmp/setup-ec2-user.sh: line 152: java: command not found

Oracle just yanked the Sun JVM, didn't it? (I've put a couple of rough workaround sketches at the bottom of this mail, below the quoted thread.)

On Sat, Feb 18, 2012 at 4:33 PM, Andrei Savu <[email protected]> wrote:

> Tom, any ideas? I will debug this tomorrow morning.
>
> On Feb 18, 2012 11:46 PM, "Evan Pollan" <[email protected]> wrote:
>
>> I have that in my properties file (recall that I ran into a critical
>> mapreduce bug on Ubuntu that was introduced in CDH3U3). So, it is trying
>> to pull down the right bits -- i.e., I've been using cdh3u2 rather than
>> cdh3u3 since u3 was released a couple of weeks ago.
>>
>> However, it doesn't even look like the install automation is being run at
>> all. The first thing the register_cloudera_repo function does in
>> install_cdh_hadoop is cat the CDH repo information into
>> /etc/apt/sources.list.d/cloudera.list. Well, that file doesn't exist on my
>> systems.
>>
>> What is going on!?
>>
>> On Sat, Feb 18, 2012 at 3:37 PM, Andrei Savu <[email protected]> wrote:
>>
>>> Cloudera just released CDH4 - this failure may be related to that. Can
>>> you try to specify whirr.env.repo=cdh3u2?
>>>
>>> On Feb 18, 2012 11:16 PM, "Evan Pollan" <[email protected]> wrote:
>>>
>>>> Argh -- I just started having a similar problem with whirr-0.7.0
>>>> pulling from the cdh3u2 repo and installing the basic hadoop stack:
>>>>
>>>> Successfully executed configure script: [output=Reading package lists...
>>>> Building dependency tree...
>>>> Reading state information...
>>>> Reading package lists...
>>>> Building dependency tree...
>>>> Reading state information...
>>>> ,
>>>> error=/tmp/configure-hadoop-datanode_hadoop-tasktracker/configure-hadoop-datanode_hadoop-tasktracker.sh:
>>>> line 73: /etc/hadoop-0.20/conf.dist/hadoop-metrics.properties: No such file
>>>> or directory
>>>> chgrp: invalid group: `hadoop'
>>>> chgrp: invalid group: `hadoop'
>>>> chgrp: invalid group: `hadoop'
>>>> chmod: missing operand after `/var/log/hadoop/logs'
>>>> Try `chmod --help' for more information.
>>>> E: Couldn't find package hadoop-0.20-datanode
>>>> hadoop-0.20-datanode: unrecognized service
>>>> E: Couldn't find package hadoop-0.20-tasktracker
>>>> hadoop-0.20-tasktracker: unrecognized service
>>>> , exitCode=0]
>>>>
>>>> Same whirr config I've been using for a while -- this just started
>>>> happening to me today. Three clusters in a row failed in this way.
>>>>
>>>> On Fri, Feb 17, 2012 at 10:49 AM, Andrei Savu <[email protected]> wrote:
>>>>
>>>>> The trunk should work just fine. I think in your case the download is
>>>>> failing for Hadoop or for Mahout.
>>>>>
>>>>> On Fri, Feb 17, 2012 at 6:33 PM, Frank Scholten <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am having trouble starting a Hadoop / Mahout cluster with Whirr
>>>>>> trunk, commit 44fb39fc8.
>>>>>>
>>>>>> Several errors are reported. The first one is:
>>>>>>
>>>>>> Bootstrapping cluster
>>>>>> Configuring template
>>>>>> Starting 1 node(s) with roles [hadoop-jobtracker, hadoop-namenode, mahout-client]
>>>>>> Configuring template
>>>>>> Starting 4 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
>>>>>> Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>>>>> Dying because - net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF
>>>>>> << (ubuntu:rsa[fingerprint(af:e3:53:27:e0:12:18:54:1c:fc:3b:24:b9:18:39:10),sha1(83:6a:70:2f:c2:d5:3d:e0:05:7a:4a:e5:1a:51:67:dc:2b:56:62:18)]@50.17.130.132:22)
>>>>>> error acquiring SSHClient(timeout=60000) (attempt 1 of 7): Socket closed
>>>>>>
>>>>>> This repeats several times until I get a stacktrace:
>>>>>>
>>>>>> call get() on this exception to get access to the task in progress
>>>>>> at org.jclouds.compute.callables.BlockUntilInitScriptStatusIsZeroThenReturnOutput.get(BlockUntilInitScriptStatusIsZeroThenReturnOutput.java:195)
>>>>>> at org.jclouds.compute.callables.RunScriptOnNodeAsInitScriptUsingSshAndBlockUntilComplete.doCall(RunScriptOnNodeAsInitScriptUsingSshAndBlockUntilComplete.java:60)
>>>>>> ... 8 more
>>>>>>
>>>>>> which is also repeated for several roles,
>>>>>> and at the end I get
>>>>>>
>>>>>> Successfully executed configure script: [output=, error=chown: invalid user: `hadoop:hadoop'
>>>>>> cp: target `/usr/local/hadoop/conf' is not a directory
>>>>>> cp: cannot create regular file `/usr/local/hadoop/conf': No such file or directory
>>>>>> chown: invalid user: `hadoop:hadoop'
>>>>>> chown: invalid user: `hadoop:hadoop'
>>>>>> chown: invalid user: `hadoop:hadoop'
>>>>>> Unknown id: hadoop
>>>>>> Unknown id: hadoop
>>>>>> , exitCode=0]
>>>>>>
>>>>>> for several roles.
>>>>>>
>>>>>> Has something changed recently that caused this problem?
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Frank
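If Oracle/Canonical really have pulled the sun-java6 packages from the partner archive, the simplest short-term workaround I can think of is to fall back to OpenJDK in the setup script. This is only a rough, untested sketch -- the package name and JAVA_HOME path below assume a Lucid amd64 AMI, and I haven't checked it against what setup-ec2-user.sh actually does:

    # Fallback sketch: install OpenJDK 6 instead of the now-missing sun-java6-jdk.
    # JAVA_HOME path assumes Ubuntu 10.04 (lucid); adjust for other AMIs.
    apt-get update
    apt-get -y install openjdk-6-jdk
    echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk' >> /etc/profile
    export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
    java -version

If "java -version" prints an OpenJDK build at the end of the script instead of "command not found", the install should at least get past this point.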

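And regarding Evan's point that register_cloudera_repo never created /etc/apt/sources.list.d/cloudera.list: to rule out the repo itself, you can register CDH3u2 by hand and check whether the hadoop-0.20 packages become visible. Again just a sketch -- the archive URL, suite name (lucid-cdh3u2) and key URL are from memory, so double-check them against Cloudera's docs, and keep whirr.env.repo=cdh3u2 in your properties file as Andrei suggested:

    # Manual sketch of roughly what the CDH repo registration should leave behind.
    # Assumes Ubuntu lucid and the cdh3u2 suite -- verify against Cloudera's docs.
    echo 'deb http://archive.cloudera.com/debian lucid-cdh3u2 contrib' > /etc/apt/sources.list.d/cloudera.list
    echo 'deb-src http://archive.cloudera.com/debian lucid-cdh3u2 contrib' >> /etc/apt/sources.list.d/cloudera.list
    curl -s http://archive.cloudera.com/debian/archive.key | apt-key add -
    apt-get update
    apt-cache policy hadoop-0.20-datanode hadoop-0.20-tasktracker

If apt-cache still can't see hadoop-0.20-datanode / hadoop-0.20-tasktracker after that, the "Couldn't find package" errors are a repository problem rather than a Whirr scripting problem.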