Hi,
I've started Hadoop successfully, but after loading a .txt file and
running a filter on it, Pig throws the exception below.
grunt> A= LOAD 'data/msisdn.txt' USING PigStorage();
grunt> B= FOREACH A GENERATE $0;
grunt> DUMP B;
2011-09-17 01:17:43,408 [main] INFO
org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script:
UNKNOWN
2011-09-17 01:17:43,409 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-09-17 01:17:43,652 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: B:
Store(hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250:org.apache.pig.impl.io.InterStorage)
- scope-4 Operator Key: scope-4)
2011-09-17 01:17:43,662 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler -
File concatenation threshold: 100 optimistic? false
2011-09-17 01:17:43,688 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 1
2011-09-17 01:17:43,689 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size after optimization: 1
2011-09-17 01:17:43,742 [main] INFO
org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to
the job
2011-09-17 01:17:43,754 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-09-17 01:17:46,447 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job
2011-09-17 01:17:46,609 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 1 map-reduce job(s) waiting for submission.
2011-09-17 01:17:47,525 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2011-09-17 01:17:48,158 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- job null has failed! Stop running all dependent jobs
2011-09-17 01:17:48,162 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2011-09-17 01:17:48,169 [main] ERROR
org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate
exception from backend error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path
does not exist: hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
2011-09-17 01:17:48,173 [main] ERROR
org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-09-17 01:17:48,174 [main] INFO
org.apache.pig.tools.pigstats.PigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt
Features
0.20.2 0.8.1 kiranprasad.g 2011-09-17 01:17:43 2011-09-17 01:17:48
UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
N/A A,B MAP_ONLY Message:
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path
does not exist: hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
at
org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at
org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
Input path does not exist:
hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
at
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:268)
... 7 more
hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250,
Input(s):
Failed to read data from
"hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt"
Output(s):
Failed to produce result in
"hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
null
2011-09-17 01:17:48,174 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Failed!
2011-09-17 01:17:48,184 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1066: Unable to open iterator for alias B
Details at logfile: /home/kiranprasad.g/pig-0.8.1/pig_1316202429844.log
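The ERROR 2118 above reports that hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt does not exist. A relative path such as 'data/msisdn.txt' is resolved against the user's HDFS home directory (/user/<username>), not against the local working directory. The sketch below only illustrates that resolution; resolve_hdfs_path is a hypothetical helper, not part of Hadoop or Pig, and the local file location in the comments is an assumption.

```shell
# Hypothetical helper: show where HDFS looks for a path given to LOAD.
# Absolute paths are used as-is; relative paths land under /user/<username>.
resolve_hdfs_path() {
    user="$1"
    path="$2"
    case "$path" in
        /*) printf '%s\n' "$path" ;;
        *)  printf '/user/%s/%s\n' "$user" "$path" ;;
    esac
}

resolve_hdfs_path kiranprasad.g data/msisdn.txt
# prints /user/kiranprasad.g/data/msisdn.txt

# If the file only exists on the local disk, it would first have to be
# copied into HDFS (needs a running cluster; the local path is an assumption):
#   bin/hadoop fs -mkdir data
#   bin/hadoop fs -put /path/on/local/disk/msisdn.txt data/
#   bin/hadoop fs -ls data
```

Once the file is listed by the last command, the same LOAD statement should find it.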
-----Original Message-----
From: kiranprasad
Sent: Friday, September 16, 2011 6:37 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.
Thanks a lot. Hadoop cluster has started.
Regards
Kiran.G
-----Original Message-----
From: Stephan Gammeter
Sent: Friday, September 16, 2011 5:35 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.
Then you have a process bound to 10.0.0.61:8020. This means either you did
not kill your instances correctly, or you can't bind to that port because it
was only just released (I think that happens sometimes, but I'm not 100% sure
about that). You can find out which process is listening on that port via
netstat -nltp.
Example output:
foo@bar:~/somepath $ netstat -nltp
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address     Foreign Address   State    PID/Program name
tcp        0      0 0.0.0.0:2049      0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:24579     0.0.0.0:*         LISTEN   9794/skype
tcp        0      0 0.0.0.0:7         0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:36460     0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:59055     0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:111       0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:32784     0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:22        0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:53050     0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:49692     0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:48125     0.0.0.0:*         LISTEN   -
tcp        0      0 0.0.0.0:55933     0.0.0.0:*         LISTEN   -
You should find a line where the "Local Address" ends with :8020; on that
line, the 'PID/Program name' column shows the process ID.
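Picking that line out by eye can be replaced with a small filter. This is a minimal sketch, assuming the column layout of the netstat -nltp output shown above; listener_on_port is a hypothetical helper name.

```shell
# Hypothetical helper: read `netstat -nltp` output on stdin and print the
# PID/Program name column of every TCP listener whose local address ends
# in the given port.
listener_on_port() {
    awk -v port="$1" '$1 ~ /^tcp/ && $4 ~ (":" port "$") { print $NF }'
}

# Usage (run as root, or as the user owning the process, to see the PID):
#   netstat -nltp 2>/dev/null | listener_on_port 8020
# A stale NameNode would typically show up as <pid>/java; after confirming
# what it is, it can be stopped with: kill <pid>
```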
On 09/16/2011 01:56 PM, kiranprasad wrote:
Hi
I am getting the error below after clearing and reformatting the namenode.
2011-09-16 22:18:29,307 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = pig4/127.0.0.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build =
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r
911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-09-16 22:18:29,408 ERROR
org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException:
Problem binding to /10.0.0.61:8020 : Address already in use
at org.apache.hadoop.ipc.Server.bind(Server.java:190)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:253)
at org.apache.hadoop.ipc.Server.<init>(Server.java:1026)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:488)
at org.apache.hadoop.ipc.RPC.getServer(RPC.java:450)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:191)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at org.apache.hadoop.ipc.Server.bind(Server.java:188)
... 8 more
2011-09-16 22:18:29,409 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at pig4/127.0.0.1
************************************************************/
-----Original Message-----
From: Stephan Gammeter
Sent: Friday, September 16, 2011 4:54 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.
Clear every directory where the Hadoop datanodes or namenodes stored data
(it should all be somewhere under /tmp/).
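The clear-and-reformat sequence can be sketched as follows. This is a dry run by default: it prints the commands instead of executing them, because the steps wipe all HDFS data. The install path and the /tmp/hadoop-kiranprasad.g storage directory are taken from the logs in this thread; run and reset_hdfs are hypothetical helper names. The sketch only covers the local machine; the same storage directory on the slaves would presumably need clearing too.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Stop the daemons, remove the storage directory, format a fresh
# namenode, and start everything again.
reset_hdfs() {
    hadoop_home="$1"
    data_dir="$2"
    run "$hadoop_home/bin/stop-all.sh"
    run rm -rf "$data_dir"
    run "$hadoop_home/bin/hadoop" namenode -format
    run "$hadoop_home/bin/start-all.sh"
}

reset_hdfs /home/kiranprasad.g/hadoop-0.20.2 /tmp/hadoop-kiranprasad.g
```

Setting DRY_RUN=0 before the call would execute the commands for real; review the printed list first.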
On 09/16/2011 01:21 PM, kiranprasad wrote:
What do I need to clear from the Hadoop directory?
-----Original Message-----
From: Stephan Gammeter
Sent: Friday, September 16, 2011 3:57 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.
Try clearing your Hadoop directories and reformatting the namenode; it
seemed to help in this case (cf.
http://web.archiveorange.com/archive/v/GJ8pzKvfDoYHyDQpVRSS ).
On 09/16/2011 12:21 PM, kiranprasad wrote:
I get the error below when I try to run the .sh start scripts.
LOG:
=====
2011-09-16 19:51:50,310 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = pig4/127.0.0.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build =
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r
911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-09-16 19:51:51,170 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
Initializing RPC Metrics with hostName=NameNode, port=8020
2011-09-16 19:51:51,197 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
10.0.0.61/10.0.0.61:8020
2011-09-16 19:51:51,201 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-09-16 19:51:51,203 INFO
org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
Initializing NameNodeMeterics using context
object:org.apache.hadoop.metrics.spi.NullContext
2011-09-16 19:51:51,474 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
fsOwner=kiranprasad.g,kiranprasad.g
2011-09-16 19:51:51,474 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
supergroup=supergroup
2011-09-16 19:51:51,474 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
isPermissionEnabled=true
2011-09-16 19:51:51,509 INFO
org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
Initializing FSNamesystemMetrics using context
object:org.apache.hadoop.metrics.spi.NullContext
2011-09-16 19:51:51,512 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
FSNamesystemStatusMBean
2011-09-16 19:51:52,355 ERROR
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
initialization failed.
org.apache.hadoop.hdfs.server.common.IncorrectVersionException:
Unexpected version of storage directory
/tmp/hadoop-kiranprasad.g/dfs/name. Reported: -19.
Expecting = -18.
at
org.apache.hadoop.hdfs.server.common.Storage.getFields(Storage.java:647)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.getFields(FSImage.java:542)
at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:227)
at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:216)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:301)
at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-09-16 19:51:52,357 INFO org.apache.hadoop.ipc.Server: Stopping
server on 8020
2011-09-16 19:51:52,573 ERROR
org.apache.hadoop.hdfs.server.namenode.NameNode:
org.apache.hadoop.hdfs.server.common.IncorrectVersionException:
Unexpected
version of storage directory /tmp/hadoop-kiranprasad.g/dfs/name.
Reported: -19. Expecting = -18.
at
org.apache.hadoop.hdfs.server.common.Storage.getFields(Storage.java:647)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.getFields(FSImage.java:542)
at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:227)
at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:216)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:301)
at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-09-16 19:51:52,593 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at pig4/127.0.0.1
************************************************************/
Regards
Kiran.G
-----Original Message-----
From: Stephan Gammeter
Sent: Friday, September 16, 2011 2:35 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.
Are your HDFS nodes running? Did they complete the startup? What do the
logs say?
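One quick way to answer the first question is the JDK's jps tool, which lists running Java processes by main class name. The sketch below checks jps output for the daemons that start-all.sh launches on the master in this thread (NameNode, SecondaryNameNode, JobTracker); missing_daemons is a hypothetical helper, and it assumes jps is on the PATH.

```shell
# Hypothetical helper: read `jps` output on stdin and print the name of
# every expected master daemon that is NOT listed.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode SecondaryNameNode JobTracker; do
        # -w keeps "NameNode" from matching inside "SecondaryNameNode"
        printf '%s\n' "$jps_output" | grep -qw "$daemon" || printf '%s\n' "$daemon"
    done
}

# Usage on the master:
#   jps | missing_daemons
# No output means all three daemons are running. For any daemon that is
# reported, look at the matching file under logs/ in the Hadoop directory.
```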
On machines where /dev/random is starved of entropy (machines with little
load, and possibly VMs), I think there can be an issue with Jetty (the
internal HTTP server) blocking during startup, because it wants to
initialize the secure random number generator.
If you see in your datanode logs that they get stuck upon startup, like this:
stephaga@googolplex:/home/awesome/hadoop/hadoop $ head -n 30
logs/hadoop-awesome-datanode-bender15.log.2011-09-07
2011-09-07 16:47:11,712 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = bender15##################
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2-append
STARTUP_MSG: build =
http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append
-r 1057313; compiled by 'awesome' on Fri Feb 18 15:36:52 CET 2011
************************************************************/
2011-09-07 16:47:19,051 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
FSDatasetStatusMBean
2011-09-07 16:47:19,054 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at
50010
2011-09-07 16:47:19,057 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is
16777216 bytes/s
2011-09-07 16:47:19,118 INFO org.mortbay.log: Logging to
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
org.mortbay.log.Slf4jLog
2011-09-07 16:47:19,191 INFO org.apache.hadoop.http.HttpServer: Port
returned by webServer.getConnectors()[0].getLocalPort() before open()
is -1. Opening the
listener on 50075
2011-09-07 16:47:19,191 INFO org.apache.hadoop.http.HttpServer:
listener.getLocalPort() returned 50075
webServer.getConnectors()[0].getLocalPort() returned
50075
2011-09-07 16:47:19,191 INFO org.apache.hadoop.http.HttpServer: Jetty
bound to port 50075
2011-09-07 16:47:19,191 INFO org.mortbay.log: jetty-6.1.14
----> STUCK HERE
then try adding the following line to your "hadoop-env.sh" :
# cf: http://docs.codehaus.org/display/JETTY/Connectors+slow+to+startup
# cf: http://stackoverflow.com/questions/137212/how-to-solve-performance-problem-with-java-securerandom
export HADOOP_OPTS="-Djava.security.egd=file:/dev/./urandom"
#
hope it helps,
best,
Stephan
On 09/16/2011 10:54 AM, kiranprasad wrote:
Yes, I've formatted the namenode.
From: Sudharsan Sampath <sudha...@gmail.com>
Sent: Friday, September 16, 2011 2:11 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.
Have you formatted your namenode?
Thanks
Sudhan S
On Fri, Sep 16, 2011 at 11:01 AM, kiranprasad
<kiranprasa...@imimobile.com> wrote:
Hi
I am new to Hadoop and Pig.
For the cluster I have 3 VMs (10.0.0.61 as master; 10.0.0.62 and 10.0.0.63
as slaves).
I've installed Pig on the 10.0.0.61 VM.
Hadoop version: hadoop-0.20.2; Pig: pig-0.8.1
I've updated the XML files; please find them below.
mapred-site.xml
--------------
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>10.0.0.61:8021</value>
</property>
</configuration>
core-site.xml
----------
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://10.0.0.61:8020</value>
</property>
</configuration>
hdfs-site.xml
----------------
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://10.0.0.61:8020</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>10.0.0.61:8021</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
masters
---------
10.0.0.61
slaves
--------
10.0.0.62
10.0.0.63
I've tried hadoop fs -ls, but I am still facing the same problem.
[kiranprasad.g@pig4 hadoop-0.20.2]$ bin/start-all.sh
starting namenode, logging to
/home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-namenode-pig4.out
10.0.0.62: starting datanode, logging to
/home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-datanode-pig3.out
10.0.0.63: starting datanode, logging to
/home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-datanode-pig2.out
10.0.0.61: starting secondarynamenode, logging to
/home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-secondarynamenode-pig4.out
starting jobtracker, logging to
/home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-jobtracker-pig4.out
10.0.0.63: starting tasktracker, logging to
/home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-tasktracker-pig2.out
10.0.0.62: starting tasktracker, logging to
/home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-tasktracker-pig3.out
[kiranprasad.g@pig4 hadoop-0.20.2]$
[kiranprasad.g@pig4 hadoop-0.20.2]$
[kiranprasad.g@pig4 hadoop-0.20.2]$ bin/hadoop fs -ls
After this it stopped responding; it got stuck.
Regards
Kiran.G