BUG: PIG HDF, HADOOP MAPREDUCE java.io.IOException: ..... does not exist

Ruth Garcia Mon, 25 Oct 2010 08:53:01 -0700

Hello,
I am having an error that is driving me crazy. Any help will be appreciated.

First, I configured Hadoop and HDFS according to this tutorial (I did not create a hadoop account; I used my own instead): http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29 and I do not have any problems with it. I could run the wordcount example. In other words, I could run the following command without problems:
$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount gutenberg gutenberg-output

I have also followed the Pig tutorials here: http://pig.apache.org/docs/r0.7.0/setup.html#Sample+Code and http://pig.apache.org/docs/r0.7.0/tutorial.html .


In both cases, I can run local Pig scripts without problems.

Nevertheless, when using Hadoop and Pig to run MapReduce jobs, I get the same error: it cannot detect the file that is being loaded. I have put that file into the HDFS directory (the same one used for the wordcount example), I have placed the file to be loaded everywhere, and I still get the error that the file to be loaded "does not exist." For some reason, when I am using Pig it seems to look for files in a directory that is unknown to me. Could someone please help me with this issue?

The error that I receive for the example http://pig.apache.org/docs/r0.7.0/setup.html#Sample+Code is:

$ java -cp pig.jar:.:$HADOOPDIR idmapreduce

10/10/25 17:10:01 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: file:///
10/10/25 17:10:01 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/10/25 17:10:03 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
10/10/25 17:10:03 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/10/25 17:10:08 INFO mapReduceLayer.MapReduceLauncher: 0% complete
10/10/25 17:10:08 ERROR mapReduceLayer.MapReduceLauncher: Map reduce job failed
10/10/25 17:10:08 ERROR mapReduceLayer.MapReduceLauncher: java.io.IOException: passwd does not exist
   at org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:115)
   at org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
   at org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:200)
   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
   at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
   at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
   at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
   at java.lang.Thread.run(Thread.java:619)

When doing the same with http://pig.apache.org/docs/r0.7.0/tutorial.html using:

$ java -cp $HOME/pigtmp/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main $HOME/pigtmp/script1-hadooptest.pig

2010-10-25 17:14:32,651 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2010-10-25 17:14:32,815 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2010-10-25 17:14:34,312 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2010-10-25 17:14:34,314 [Thread-4] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2010-10-25 17:14:39,312 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2010-10-25 17:14:39,313 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed
2010-10-25 17:14:39,313 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - java.io.IOException: excite.log.bz2 does not exist
   at org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:115)
   at org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
   at org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:200)
   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
   at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
   at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
   at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
   at java.lang.Thread.run(Thread.java:619)



Please, can someone help me?

Ruth
