Hi

I've started Hadoop successfully, but after loading a .txt file, running a filter on it gives the exception below.

grunt> A= LOAD 'data/msisdn.txt' USING PigStorage();
grunt> B= FOREACH A GENERATE $0;
grunt> DUMP B;
2011-09-17 01:17:43,408 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2011-09-17 01:17:43,409 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-09-17 01:17:43,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: B: Store(hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250:org.apache.pig.impl.io.InterStorage) - scope-4 Operator Key: scope-4)
2011-09-17 01:17:43,662 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2011-09-17 01:17:43,688 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2011-09-17 01:17:43,689 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2011-09-17 01:17:43,742 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-09-17 01:17:43,754 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-09-17 01:17:46,447 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-09-17 01:17:46,609 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-09-17 01:17:47,525 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-09-17 01:17:48,158 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2011-09-17 01:17:48,162 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-09-17 01:17:48,169 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
2011-09-17 01:17:48,173 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-09-17 01:17:48,174 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion   PigVersion      UserId          StartedAt               FinishedAt              Features
0.20.2          0.8.1           kiranprasad.g   2011-09-17 01:17:43     2011-09-17 01:17:48     UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
N/A A,B MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
       at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
       at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
       at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
       at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
       at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
       at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
       at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
       at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
       at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:268)
       ... 7 more
       hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250,

Input(s):
Failed to read data from "hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt"

Output(s):
Failed to produce result in "hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
null


2011-09-17 01:17:48,174 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2011-09-17 01:17:48,184 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B
Details at logfile: /home/kiranprasad.g/pig-0.8.1/pig_1316202429844.log
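
The path in ERROR 2118 suggests that the relative path data/msisdn.txt was resolved against the user's HDFS home directory (/user/kiranprasad.g), not against the local filesystem. A minimal sketch of that resolution rule follows, assuming the default HDFS home directory layout; the helper name hdfs_resolve and the local path /path/to/msisdn.txt are hypothetical and only for illustration:

```shell
# Hypothetical helper: show how HDFS would resolve a path for a given user.
# Absolute paths are kept; relative paths resolve against /user/<user>.
hdfs_resolve() {
    user="$1"
    path="$2"
    case "$path" in
        /*) printf '%s\n' "$path" ;;
        *)  printf '/user/%s/%s\n' "$user" "$path" ;;
    esac
}

# Prints the path that appears in the error message above.
hdfs_resolve kiranprasad.g data/msisdn.txt

# Assumed fix, run from the Hadoop install directory
# (the local source path is a placeholder):
#   bin/hadoop fs -mkdir data
#   bin/hadoop fs -put /path/to/msisdn.txt data/msisdn.txt
#   bin/hadoop fs -ls data
```

If the file only exists on the local disk of the machine running grunt, uploading it to HDFS first (or starting Pig in local mode) would be the likely remedy.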

-----Original Message----- From: kiranprasad
Sent: Friday, September 16, 2011 6:37 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.

Thanks a lot. Hadoop cluster has started.

Regards
Kiran.G

-----Original Message----- From: Stephan Gammeter
Sent: Friday, September 16, 2011 5:35 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.


Then you have a process bound to 10.0.0.61:8020. This means either you did
not kill your instances correctly, or you can't bind to that port because it
was only just released (I think that happens sometimes, not 100% sure about
that). You can find out which process is listening on that port with:

netstat -nltp

Example output:

foo@bar:~/somepath $ netstat -nltp
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:24579           0.0.0.0:*               LISTEN      9794/skype
tcp        0      0 0.0.0.0:7               0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:36460           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:59055           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:32784           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:53050           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:49692           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:48125           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:55933           0.0.0.0:*               LISTEN      -

You should find a line where the "Local Address" ends with :8020; on that
line, the 'PID/Program name' column shows the process ID.
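
To avoid scanning the whole listing by eye, the same lookup can be sketched as a small filter. This is only an illustration: the function name listeners_on_port is made up here, and it assumes the column layout of `netstat -nltp` shown above, where the local address is the fourth field.

```shell
# Hypothetical helper: read `netstat -nltp` output on stdin and print only
# the TCP lines whose Local Address (4th field) ends with :<port>.
listeners_on_port() {
    port="$1"
    awk -v p=":$port" '$1 ~ /^tcp/ && $4 ~ (p "$") { print }'
}
```

Usage would be something like `netstat -nltp | listeners_on_port 8020` (run as root to see PIDs of processes owned by other users); the last column of any matching line is the PID/program name to stop.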

On 09/16/2011 01:56 PM, kiranprasad wrote:
Hi

I am getting the error below after clearing the directories and reformatting the namenode.

2011-09-16 22:18:29,307 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = pig4/127.0.0.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-09-16 22:18:29,408 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Problem binding to /10.0.0.61:8020 : Address already in use
at org.apache.hadoop.ipc.Server.bind(Server.java:190)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:253)
at org.apache.hadoop.ipc.Server.<init>(Server.java:1026)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:488)
at org.apache.hadoop.ipc.RPC.getServer(RPC.java:450)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:191)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at org.apache.hadoop.ipc.Server.bind(Server.java:188)
... 8 more

2011-09-16 22:18:29,409 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at pig4/127.0.0.1
************************************************************/

-----Original Message----- From: Stephan Gammeter
Sent: Friday, September 16, 2011 4:54 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.

Clear everything where Hadoop stored data for the datanodes or the namenode (it should all be somewhere in /tmp/).
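
A rough sketch of what "clear and reformat" could look like on the master, assuming the default storage location /tmp/hadoop-<user> (the namenode log further down the thread reports /tmp/hadoop-kiranprasad.g/dfs/name). The function names run and reset_hdfs and the HADOOP_HOME default are assumptions for illustration. This destroys all HDFS data, so the sketch only prints the commands unless DRY_RUN=0 is set:

```shell
# Assumed install location; override HADOOP_HOME if yours differs.
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop-0.20.2}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop the cluster, wipe the local storage directory,
# reformat the namenode, and start the cluster again.
reset_hdfs() {
    user="$1"
    run "$HADOOP_HOME/bin/stop-all.sh"
    run rm -rf "/tmp/hadoop-$user"
    run "$HADOOP_HOME/bin/hadoop" namenode -format
    run "$HADOOP_HOME/bin/start-all.sh"
}
```

Calling `reset_hdfs kiranprasad.g` prints the four steps for review. The rm step only covers the local machine; the same directory on each slave would presumably need clearing as well, before the cluster is started again.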

On 09/16/2011 01:21 PM, kiranprasad wrote:
What do I need to clear from the Hadoop directory?

-----Original Message----- From: Stephan Gammeter
Sent: Friday, September 16, 2011 3:57 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.

Try clearing your Hadoop directories and reformatting the namenode; it seemed to help in this case (cf.
http://web.archiveorange.com/archive/v/GJ8pzKvfDoYHyDQpVRSS ).

On 09/16/2011 12:21 PM, kiranprasad wrote:
I get the error below when I try to start the .sh files.

LOG:
=====

2011-09-16 19:51:50,310 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = pig4/127.0.0.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-09-16 19:51:51,170 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=8020
2011-09-16 19:51:51,197 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: 10.0.0.61/10.0.0.61:8020
2011-09-16 19:51:51,201 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-09-16 19:51:51,203 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-09-16 19:51:51,474 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=kiranprasad.g,kiranprasad.g
2011-09-16 19:51:51,474 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-09-16 19:51:51,474 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-09-16 19:51:51,509 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-09-16 19:51:51,512 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-09-16 19:51:52,355 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /tmp/hadoop-kiranprasad.g/dfs/name. Reported: -19. Expecting = -18.
at org.apache.hadoop.hdfs.server.common.Storage.getFields(Storage.java:647)
at org.apache.hadoop.hdfs.server.namenode.FSImage.getFields(FSImage.java:542)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:227)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:216)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:301)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2011-09-16 19:51:52,357 INFO org.apache.hadoop.ipc.Server: Stopping server on 8020
2011-09-16 19:51:52,573 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /tmp/hadoop-kiranprasad.g/dfs/name. Reported: -19. Expecting = -18.
at org.apache.hadoop.hdfs.server.common.Storage.getFields(Storage.java:647)
at org.apache.hadoop.hdfs.server.namenode.FSImage.getFields(FSImage.java:542)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:227)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.read(Storage.java:216)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:301)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)

2011-09-16 19:51:52,593 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at pig4/127.0.0.1
************************************************************/

Regards

Kiran.G



-----Original Message----- From: Stephan Gammeter
Sent: Friday, September 16, 2011 2:35 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.

Are your HDFS nodes running? Did they complete the startup? What do the logs say?

On machines where /dev/random is starved (machines with little load, and maybe VMs), I think there can be an issue with Jetty (the internal HTTP server) blocking during startup because it wants to initialize the secure random number generator.

If you see in your datanode logs that they get stuck on startup:

stephaga@googolplex:/home/awesome/hadoop/hadoop $ head -n 30 logs/hadoop-awesome-datanode-bender15.log.2011-09-07
2011-09-07 16:47:11,712 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = bender15##################
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2-append
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append -r 1057313; compiled by 'awesome' on Fri Feb 18 15:36:52 CET 2011
************************************************************/
2011-09-07 16:47:19,051 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2011-09-07 16:47:19,054 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2011-09-07 16:47:19,057 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 16777216 bytes/s
2011-09-07 16:47:19,118 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-09-07 16:47:19,191 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2011-09-07 16:47:19,191 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2011-09-07 16:47:19,191 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2011-09-07 16:47:19,191 INFO org.mortbay.log: jetty-6.1.14
----> STUCK HERE

then try adding the following line to your hadoop-env.sh:

# cf: http://docs.codehaus.org/display/JETTY/Connectors+slow+to+startup
# cf: http://stackoverflow.com/questions/137212/how-to-solve-performance-problem-with-java-securerandom
export HADOOP_OPTS="-Djava.security.egd=file:/dev/./urandom"
#

hope it helps,

best,
Stephan

On 09/16/2011 10:54 AM, kiranprasad wrote:
Yes, I've formatted the namenode.
From: Sudharsan Sampath <sudha...@gmail.com>
Sent: Friday, September 16, 2011 2:11 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: While starting HDFS process getting stucked.
Have you formatted your namenode?
Thanks
Sudhan S

On Fri, Sep 16, 2011 at 11:01 AM, kiranprasad <kiranprasa...@imimobile.com> wrote:

Hi

I am new to Hadoop and Pig.

For the cluster I have 3 VMs (10.0.0.61 as master; 10.0.0.62 and 10.0.0.63 as slaves).

I've installed Pig on the 10.0.0.61 VM.

Hadoop version: hadoop-0.20.2, Pig version: pig-0.8.1
I've updated the XML files; please find them below.

mapred-site.xml
--------------
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>10.0.0.61:8021</value>
</property>
</configuration>


core-site.xml
----------
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://10.0.0.61:8020</value>
</property>
</configuration>

hdfs-site.xml
----------------
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://10.0.0.61:8020</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>10.0.0.61:8021</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>

masters
---------
10.0.0.61

slaves
--------

10.0.0.62
10.0.0.63


I've tried hadoop fs -ls but I am still facing the same problem.

[kiranprasad.g@pig4 hadoop-0.20.2]$ bin/start-all.sh
starting namenode, logging to /home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-namenode-pig4.out
10.0.0.62: starting datanode, logging to /home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-datanode-pig3.out
10.0.0.63: starting datanode, logging to /home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-datanode-pig2.out
10.0.0.61: starting secondarynamenode, logging to /home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-secondarynamenode-pig4.out
starting jobtracker, logging to /home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-jobtracker-pig4.out
10.0.0.63: starting tasktracker, logging to /home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-tasktracker-pig2.out
10.0.0.62: starting tasktracker, logging to /home/kiranprasad.g/hadoop-0.20.2/bin/../logs/hadoop-kiranprasad.g-tasktracker-pig3.out
[kiranprasad.g@pig4 hadoop-0.20.2]$
[kiranprasad.g@pig4 hadoop-0.20.2]$
[kiranprasad.g@pig4 hadoop-0.20.2]$ bin/hadoop fs -ls

After this it stopped responding; it got stuck.

Regards
Kiran.G
