> curl: (18) transfer closed with 22890056 bytes remaining to read

I think this means that it failed to download the files from apache.org - maybe because we are trying 18 downloads at the same time.
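For illustration only: one way an install script could tolerate a truncated transfer like the curl (18) above is to wrap the download in a small retry loop with backoff. This is a minimal sketch, not what Whirr actually does; the function name `retry_with_backoff` and its arguments are hypothetical.

```shell
# Hypothetical helper (not part of Whirr): retry a command with
# exponential backoff.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command> [args...]
retry_with_backoff() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  while true; do
    # Run the command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    # Double the wait so many nodes do not hit the mirror in lockstep.
    delay=$((delay * 2))
  done
}
```

A call might look like `retry_with_backoff 5 2 curl -f -L -o hadoop.tar.gz "$TARBALL_URL"`, where `$TARBALL_URL` is a placeholder for whatever URL the script fetches. Downloading to a file first, rather than streaming straight into tar, is an assumption of this sketch: it lets a truncated transfer be retried before anything is extracted.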
We should consider a different strategy for distributing artefacts when starting larger clusters. Have you seen this problem before with m1.large? I don't really understand why we don't see a few retries.

-- Andrei Savu

On Wed, Nov 2, 2011 at 5:52 PM, Paolo Castagna <[email protected]> wrote:

> Hi Andrei,
> I connected to one of the instances which is not listed by the
> NameNode, but it is running.
> There are no Java processes running on that machine.
>
> This is what I see in /tmp/logs/stderr.log:
>
> dpkg-preconfigure: unable to re-open stdin:
> sun-dlj-v1-1 license has already been accepted
> sun-dlj-v1-1 license has already been accepted
> sun-dlj-v1-1 license has already been accepted
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/HtmlConverter to provide /usr/bin/HtmlConverter (HtmlConverter) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/appletviewer to provide /usr/bin/appletviewer (appletviewer) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/apt to provide /usr/bin/apt (apt) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/extcheck to provide /usr/bin/extcheck (extcheck) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/idlj to provide /usr/bin/idlj (idlj) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jar to provide /usr/bin/jar (jar) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jarsigner to provide /usr/bin/jarsigner (jarsigner) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/javac to provide /usr/bin/javac (javac) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/javadoc to provide /usr/bin/javadoc (javadoc) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/javah to provide /usr/bin/javah (javah) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/javap to provide /usr/bin/javap (javap) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jconsole to provide /usr/bin/jconsole (jconsole) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jdb to provide /usr/bin/jdb (jdb) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jhat to provide /usr/bin/jhat (jhat) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jinfo to provide /usr/bin/jinfo (jinfo) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jmap to provide /usr/bin/jmap (jmap) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jps to provide /usr/bin/jps (jps) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jrunscript to provide /usr/bin/jrunscript (jrunscript) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jsadebugd to provide /usr/bin/jsadebugd (jsadebugd) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jstack to provide /usr/bin/jstack (jstack) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jstat to provide /usr/bin/jstat (jstat) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/jstatd to provide /usr/bin/jstatd (jstatd) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/native2ascii to provide /usr/bin/native2ascii (native2ascii) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/rmic to provide /usr/bin/rmic (rmic) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/schemagen to provide /usr/bin/schemagen (schemagen) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/serialver to provide /usr/bin/serialver (serialver) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/wsgen to provide /usr/bin/wsgen (wsgen) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/wsimport to provide /usr/bin/wsimport (wsimport) in auto mode.
> update-alternatives: using /usr/lib/jvm/java-6-sun/bin/xjc to provide /usr/bin/xjc (xjc) in auto mode.
> java version "1.6.0_26"
> Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
> curl: (18) transfer closed with 22890056 bytes remaining to read
> curl: (22) The requested URL returned error: 404
>
> gzip: stdin: unexpected end of file
> tar: Unexpected EOF in archive
> tar: Unexpected EOF in archive
> tar: Error is not recoverable: exiting now
>
> Is it in /usr/local/hadoop/bin that I should find the shell script to
> start datanode and tasktracker daemons?
>
> That directory on this instance is empty.
>
> Paolo
>
>
> On 2 November 2011 15:38, Andrei Savu <[email protected]> wrote:
> > Try restarting the daemons. Are they running? Are there errors in the log
> > files in /tmp?
> >
> > On Wed, Nov 2, 2011 at 5:34 PM, Paolo Castagna <[email protected]> wrote:
> >>
> >> Hi Andrei,
> >> this cluster is still running, I am running a distcp job to copy my
> >> data from S3 to HDFS.
> >>
> >> The NameNode (via the Web UI) is still reporting:
> >>
> >> Live Nodes : 8
> >> Dead Nodes : 0
> >> Decommissioning Nodes : 0
> >>
> >> I do not see errors in the logs.
> >>
> >> I can try to connect to one of the machines which did not join the cluster,
> >> but I am not sure what to do to make it join the cluster once I am connected
> >> to it.
> >>
> >> Paolo
> >>
> >> On 2 November 2011 15:29, Andrei Savu <[email protected]> wrote:
> >> > Are you seeing any errors in the logs? Can you check one of the machines
> >> > that failed to join the cluster?
> >> > Are you sure they've tried to join the rest of the cluster? Maybe you have
> >> > to wait a bit more.
> >> >
> >> > -- Andrei Savu
> >> >
> >> > On Wed, Nov 2, 2011 at 5:25 PM, Paolo Castagna <[email protected]> wrote:
> >> >>
> >> >> Hi
> >> >>
> >> >> On 2 November 2011 14:59, Paolo Castagna <[email protected]> wrote:
> >> >> > Hi Andrei,
> >> >> > I've just tried again, the only difference in the recipe:
> >> >> > whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,16 hadoop-datanode+hadoop-tasktracker
> >> >> > I saw the same exception, but now I can connect to the web UIs as usual.
> >> >>
> >> >> Well, I spoke too soon.
> >> >>
> >> >> The very same cluster had 17 instances, I can see all of them running
> >> >> via the Amazon console (i.e. I am paying for them), however the
> >> >> NameNode and the JobTracker see only 8 nodes. :-(
> >> >>
> >> >> Paolo
