Hi Aaron,

I am running Hadoop 0.20.0 on Ubuntu in pseudo-distributed mode. If I remove the sleep from my start-all.sh script, my jobtracker comes up momentarily and then dies.
Here is a capture of my commands:

sgo...@desktop:~/software/hadoop-0.20.0$ bin/hadoop namenode -format
10/02/13 21:54:19 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = desktop/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
10/02/13 21:54:19 DEBUG conf.Configuration: java.io.IOException: config()
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:210)
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:197)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:937)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:964)
Re-format filesystem in /tmp/hadoop-sgoyal/dfs/name ? (Y or N) Y
10/02/13 21:54:22 DEBUG security.UserGroupInformation: Unix Login: sgoyal,sgoyal,adm,dialout,cdrom,audio,plugdev,fuse,lpadmin,admin,sambashare,mysql,cvsgroup
10/02/13 21:54:22 INFO namenode.FSNamesystem: fsOwner=sgoyal,sgoyal,adm,dialout,cdrom,audio,plugdev,fuse,lpadmin,admin,sambashare,mysql,cvsgroup
10/02/13 21:54:22 INFO namenode.FSNamesystem: supergroup=supergroup
10/02/13 21:54:22 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/02/13 21:54:22 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/02/13 21:54:22 DEBUG namenode.FSNamesystem: Preallocating Edit log, current size 0
10/02/13 21:54:22 DEBUG namenode.FSNamesystem: Edit log size is now 1049088 written 512 bytes at offset 1048576
10/02/13 21:54:22 INFO common.Storage: Storage directory /tmp/hadoop-sgoyal/dfs/name has been successfully formatted.
10/02/13 21:54:22 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at desktop/127.0.1.1
************************************************************/

sgo...@desktop:~/software/hadoop-0.20.0$ bin/start-all.sh
starting namenode, logging to /home/sgoyal/software/hadoop-0.20.0/bin/../logs/hadoop-sgoyal-namenode-desktop.out
localhost: starting datanode, logging to /home/sgoyal/software/hadoop-0.20.0/bin/../logs/hadoop-sgoyal-datanode-desktop.out
localhost: starting secondarynamenode, logging to /home/sgoyal/software/hadoop-0.20.0/bin/../logs/hadoop-sgoyal-secondarynamenode-desktop.out
starting jobtracker, logging to /home/sgoyal/software/hadoop-0.20.0/bin/../logs/hadoop-sgoyal-jobtracker-desktop.out
localhost: starting tasktracker, logging to /home/sgoyal/software/hadoop-0.20.0/bin/../logs/hadoop-sgoyal-tasktracker-desktop.out

sgo...@desktop:~/software/hadoop-0.20.0$ jps
26171 Jps
26037 JobTracker
25966 SecondaryNameNode
25778 NameNode
26130 TaskTracker
25863 DataNode

sgo...@desktop:~/software/hadoop-0.20.0$ jps
26037 JobTracker
25966 SecondaryNameNode
26203 Jps
25778 NameNode
26130 TaskTracker
25863 -- process information unavailable

sgo...@desktop:~/software/hadoop-0.20.0$ jps
26239 Jps
26037 JobTracker
25966 SecondaryNameNode
25778 NameNode
26130 TaskTracker

sgo...@desktop:~/software/hadoop-0.20.0$ jps
26037 JobTracker
25966 SecondaryNameNode
25778 NameNode
26130 TaskTracker
26252 Jps

sgo...@desktop:~/software/hadoop-0.20.0$ jps
26288 Jps
25966 SecondaryNameNode
25778 NameNode

sgo...@desktop:~/software/hadoop-0.20.0$ jps
25966 SecondaryNameNode
25778 NameNode
26298 Jps

sgo...@desktop:~/software/hadoop-0.20.0$ jps
26308 Jps
25966 SecondaryNameNode
25778 NameNode

My jobtracker logs show:

2010-02-13 21:54:40,660 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = desktop/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
2010-02-13 21:54:40,967 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=9001
2010-02-13 21:54:52,100 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2010-02-13 21:54:52,358 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
2010-02-13 21:54:52,359 INFO org.mortbay.log: jetty-6.1.14
2010-02-13 21:55:13,222 INFO org.mortbay.log: Started [email protected]:50030
2010-02-13 21:55:13,227 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2010-02-13 21:55:13,229 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9001
2010-02-13 21:55:13,229 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2010-02-13 21:55:13,942 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2010-02-13 21:55:14,049 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-sgoyal/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy4.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy4.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)
2010-02-13 21:55:14,049 WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping /tmp/hadoop-sgoyal/mapred/system/jobtracker.info retries left 4
2010-02-13 21:55:14,459 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-sgoyal/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1

I suspected the DFS was not ready, and the sleep seems to solve this issue. Look forward to hearing your take on this. Please feel free to let me know if you need any other info.
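For what it's worth, instead of a fixed 60-second sleep I could probably wait on the DFS explicitly. A rough sketch of the idea (untested on my box; it assumes HADOOP_HOME is set, and the "Datanodes available:" grep pattern is my guess at the `dfsadmin -report` output format):

```shell
#!/bin/sh
# Sketch: start DFS, then block until it is actually usable before
# starting MapReduce, rather than sleeping for a fixed interval.

"$HADOOP_HOME"/bin/start-dfs.sh

# Block until the namenode leaves safe mode.
"$HADOOP_HOME"/bin/hadoop dfsadmin -safemode wait

# Also wait until at least one datanode has registered, since the
# "could only be replicated to 0 nodes" error means no live datanodes yet.
until "$HADOOP_HOME"/bin/hadoop dfsadmin -report 2>/dev/null \
      | grep -q 'Datanodes available: [1-9]'; do
  sleep 2
done

"$HADOOP_HOME"/bin/start-mapred.sh
```

That way the startup waits only as long as it actually needs to.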
Thanks and Regards,
Sonal

On Sat, Feb 13, 2010 at 6:40 AM, Aaron Kimball <[email protected]> wrote:
> Sonal,
>
> Can I ask why you're sleeping between starting hdfs and mapreduce? I've
> never needed this in my own code. In general, Hadoop is pretty tolerant
> about starting daemons "out of order."
>
> If you need to wait for HDFS to be ready and come out of safe mode before
> launching a job, that's another story, but you can accomplish that with:
>
> $HADOOP_HOME/bin/hadoop dfsadmin -safemode wait
>
> ... which will block until HDFS is ready for user commands in read/write
> mode.
> - Aaron
>
> On Fri, Feb 12, 2010 at 8:44 AM, Sonal Goyal <[email protected]> wrote:
>
> > Hi
> >
> > I had faced a similar issue on Ubuntu and Hadoop 0.20 and modified the
> > start-all script to introduce a sleep time:
> >
> > bin=`dirname "$0"`
> > bin=`cd "$bin"; pwd`
> >
> > . "$bin"/hadoop-config.sh
> >
> > # start dfs daemons
> > "$bin"/start-dfs.sh --config $HADOOP_CONF_DIR
> > echo 'sleeping'
> > sleep 60
> > echo 'awake'
> > # start mapred daemons
> > "$bin"/start-mapred.sh --config $HADOOP_CONF_DIR
> >
> > This seems to work. Please see if this works for you.
> >
> > Thanks and Regards,
> > Sonal
> >
> > On Thu, Feb 11, 2010 at 3:56 AM, E. Sammer <[email protected]> wrote:
> >
> > > On 2/10/10 5:19 PM, Nick Klosterman wrote:
> > >
> > >> @E.Sammer, no I don't *think* that it is part of another cluster. The
> > >> tutorial is for a single-node cluster, just an initial setup to see if
> > >> you can get things up and running. I have reformatted the namenode
> > >> several times in my effort to get hadoop to work.
> > >
> > > What I mean is that the data node, at some point, connected to your
> > > name node. If you reformat the name node, the data node must be wiped
> > > clean; it's effectively trying to join a name node that no longer
> > > exists.
> > >
> > > --
> > > Eric Sammer
> > > [email protected]
> > > http://esammer.blogspot.com
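P.S. For anyone hitting the reformat issue Eric describes at the bottom of this thread: the reset he suggests would look roughly like this with the default /tmp layout from my logs above (a sketch only; if you have set dfs.data.dir in hdfs-site.xml, remove that directory instead):

```shell
#!/bin/sh
# Sketch: wipe the stale datanode storage after re-formatting the
# namenode, so the datanode doesn't try to rejoin a namenode that
# no longer exists. Assumes the default /tmp/hadoop-${USER} layout.

bin/stop-all.sh

# Remove the old datanode storage directory.
rm -rf /tmp/hadoop-${USER}/dfs/data

bin/hadoop namenode -format
bin/start-all.sh
```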
