Tariq,
Thank you. I tried this and the summary of the MapReduce job looks like:

13/04/30 14:02:35 INFO mapred.JobClient: Job complete: job_201304301251_0004
13/04/30 14:02:35 INFO mapred.JobClient: Counters: 7
13/04/30 14:02:35 INFO mapred.JobClient:   Job Counters
13/04/30 14:02:35 INFO mapred.JobClient:     Failed map tasks=1
13/04/30 14:02:35 INFO mapred.JobClient:     Launched map tasks=27
13/04/30 14:02:35 INFO mapred.JobClient:     Rack-local map tasks=27
13/04/30 14:02:35 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=151904
13/04/30 14:02:35 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
13/04/30 14:02:35 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/04/30 14:02:35 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

But there were a number of exceptions thrown, and it seemed to take longer than just running it standalone (I should have at least 4 machines working on this). The exceptions are my main concern now (there were quite a few):

. . . .
13/04/30 14:02:27 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000005_1, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/hadoop-core-2.0.0-mr1-cdh4.2.1.jar does not exist
. . . .
13/04/30 14:02:28 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000006_1, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/guava-11.0.2.jar does not exist
. . . .
13/04/30 14:02:28 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000008_0, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/zookeeper-3.4.5-cdh4.2.1.jar does not exist
. . . .
13/04/30 14:02:28 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000001_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/tools.jar does not exist
. . . .
13/04/30 14:02:28 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000000_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/Websters.txt does not exist
. . . .
13/04/30 14:02:33 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000002_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/hadoop-hdfs-2.0.0-cdh4.2.1.jar does not exist
. . . .
13/04/30 14:02:33 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000004_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/hadoop-common-2.0.0-cdh4.2.1.jar does not exist
. . . .
13/04/30 14:02:33 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000003_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/core-3.1.1.jar does not exist

No output folder was created (probably because of the numerous errors).

Kevin
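A quick aside on the FileNotFoundExceptions above: every failing task is resolving file:/home/kevin/WordCount/input/... against its own node's local disk, so any worker other than the machine that actually holds that directory cannot see the files. A minimal sketch of staging the input in HDFS before re-running, assuming HDFS is the default filesystem (per the core-site.xml further down this thread) and that /user/kevin is the HDFS home directory:

  hadoop fs -mkdir -p /user/kevin/input                              # create the HDFS input directory (-p, where supported, also creates /user/kevin)
  hadoop fs -put /home/kevin/WordCount/input/* /user/kevin/input/    # copy the local files the tasks could not find
  hadoop fs -ls /user/kevin/input                                    # confirm the cluster can see them
  hadoop jar WordCount.jar input output                              # re-run against the HDFS copies

With the files in HDFS, the relative "input" and "output" arguments passed to hadoop jar resolve under the HDFS home directory rather than under the local working directory.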
From: Mohammad Tariq [mailto:[email protected]]
Sent: Tuesday, April 30, 2013 1:32 PM
To: Kevin Burton
Subject: Re: Can't initialize cluster

Hello again Kevin,

Good that you are making progress. This is happening because when you run it as a hadoop job, it looks for the files in HDFS, and when you run it as a plain java program it looks into the local FS. Use this as your input in your code and see if it helps: file:///home/kevin/input

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

On Tue, Apr 30, 2013 at 11:36 PM, Kevin Burton <[email protected]> wrote:

We/I are/am making progress. Now I get the error:

13/04/30 12:59:40 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/04/30 12:59:40 INFO mapred.JobClient: Cleaning up the staging area hdfs://devubuntu05:9000/data/hadoop/tmp/hadoop-mapred/mapred/staging/kevin/.staging/job_201304301251_0003
13/04/30 12:59:40 ERROR security.UserGroupInformation: PriviledgedActionException as:kevin (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://devubuntu05:9000/user/kevin/input
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://devubuntu05:9000/user/kevin/input

When I run it with java -jar, the input and output are the local folders. When I run it with hadoop jar, it seems to expect the folders (input and output) to be on the HDFS file system. I am not sure why these two methods of invocation don't make the same file system assumptions. It is

hadoop jar WordCount.jar input output (which gives the above exception)

versus

java -jar WordCount.jar input output (which writes the wordcount statistics to the output folder)

This is run in the local /home/kevin/WordCount folder.

Kevin

From: Mohammad Tariq [mailto:[email protected]]
Sent: Tuesday, April 30, 2013 12:33 PM
To: [email protected]
Subject: Re: Can't initialize cluster

Set "HADOOP_MAPRED_HOME" in your hadoop-env.sh file and re-run the job. See if it helps.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

On Tue, Apr 30, 2013 at 10:10 PM, Kevin Burton <[email protected]> wrote:

To be clear, when this code is run with 'java -jar' it runs without exception. The exception occurs when I run it with 'hadoop jar'.

From: Kevin Burton [mailto:[email protected]]
Sent: Tuesday, April 30, 2013 11:36 AM
To: [email protected]
Subject: Can't initialize cluster

I have a simple MapReduce job that I am trying to get to run on my cluster. When I run it I get:

13/04/30 11:27:45 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "devubuntu05:9001"
13/04/30 11:27:45 ERROR security.UserGroupInformation: PriviledgedActionException as:kevin (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.

My core-site.xml looks like:

<property>
  <name>fs.default.name</name>
  <value>hdfs://devubuntu05:9000</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.</description>
</property>

So I am unclear as to why it is looking at devubuntu05:9001?
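On the devubuntu05:9001 question: that value would not come from core-site.xml; in an MRv1/CDH4-style setup it is normally the JobTracker address read from mapred-site.xml. A sketch of the kind of entry that would produce it (the value shown is assumed, not copied from the actual configuration; the older name for the same property is mapred.job.tracker):

  <!-- mapred-site.xml: JobTracker address the client uses when the job is launched with "hadoop jar" -->
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>devubuntu05:9001</value>
  </property>

That would also explain why the two launch commands behave differently: hadoop jar puts the cluster's *-site.xml files on the classpath, so the client is pointed at the JobTracker and at HDFS, while java -jar loads none of them and falls back to the local job runner and the local filesystem.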
Here is the code:

public static void WordCount(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
        System.err.println("Usage: wordcount <in> <out>");
        System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(WordCount.TokenizerMapper.class);
    job.setCombinerClass(WordCount.IntSumReducer.class);
    job.setReducerClass(WordCount.IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}

Ideas?
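One more note, on the warning in the earlier log ("Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same."): below is a rough sketch of the same driver restructured around Tool/ToolRunner. The class name WordCountDriver is made up here, and TokenizerMapper/IntSumReducer are assumed to be the existing classes from the WordCount jar; this is an illustration, not the code that was actually run.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            return 2;
        }
        // getConf() returns the Configuration that ToolRunner has already
        // populated from the client-side *-site.xml files and any -D/-fs/-jt options
        Job job = new Job(getConf(), "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCount.TokenizerMapper.class);   // existing mapper (assumed)
        job.setCombinerClass(WordCount.IntSumReducer.class);   // existing reducer (assumed)
        job.setReducerClass(WordCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner applies GenericOptionsParser itself, so generic options are
        // stripped before run() sees the remaining arguments
        System.exit(ToolRunner.run(new Configuration(), new WordCountDriver(), args));
    }
}

It would be launched the same way as before, e.g. hadoop jar WordCount.jar input output, or with an explicit URI such as file:///home/kevin/input as suggested above.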
