It worked, finally!! It worked with both the old and the new libraries. The problem was that Hadoop did not have read permission on my run.jar on the local filesystem. I discovered this when, out of pure desperation, I tried to copy the jar from the local filesystem onto HDFS and it failed saying the file was not found. Then I changed the permissions and, yes, I graduated from Hello World!
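For anyone hitting the same wall, here is a minimal local sketch of the failure mode and the fix (the jar name run.jar is taken from the message above; the point is that the user submitting the job needs read permission on the jar):

```shell
# Reproduce the failure mode locally: a jar that only its owner can read.
# Another user (e.g. hdfs, when submitting with "sudo -u hdfs hadoop jar ...")
# gets a "file does not exist"-style error for such a file.
touch run.jar
chmod 600 run.jar
ls -l run.jar

# The fix: make the jar readable by everyone before submitting the job.
chmod 644 run.jar
ls -l run.jar
```

The same symptom showed up when copying the jar to HDFS, which is what finally exposed the root cause: both operations read the jar as a different user than the one who built it.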
Thanks everyone for your inputs.

On Thu, Nov 29, 2012 at 8:22 PM, Sandeep Jangra <[email protected]> wrote:

Yes, I am working on two versions, old and new. If I have any luck, I will for sure email my findings back.
Thanks for this input on libjars.

On Thu, Nov 29, 2012 at 7:55 PM, Mahesh Balija <[email protected]> wrote:

Hi Sandeep,

One important thing here: if you are passing your jar with "hadoop jar your.jar", then it is absolutely NOT required to pass it once again with the -libjars option.
As Harsh said, recompile once again and try running your command without the -libjars option.
You can also try your luck running the job against both the old and the new API.

Best,
Mahesh Balija,
Calsoft Labs.

On Fri, Nov 30, 2012 at 2:16 AM, Sandeep Jangra <[email protected]> wrote:

Hi Harsh,

I tried putting the generic option first, but it throws a FileNotFoundException. The jar is in the current directory. Then I tried giving the absolute path of the jar, but that also brought no luck.

sudo -u hdfs hadoop jar word_cnt.jar WordCount2 -libjars=word_cnt.jar /tmp/root/input /tmp/root/output17

Exception in thread "main" java.io.FileNotFoundException: File word_cnt.jar does not exist.
    at org.apache.hadoop.util.GenericOptionsParser.validateFiles(GenericOptionsParser.java:384)
    at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:280)
    at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:418)
    at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:168)
    at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:151)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)

Also, I have been deleting my jars and the class directory before each new try.
So even I am suspicious about why I see this:

12/11/29 10:20:59 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

Could it be that my Hadoop is running on the old jar files (the ones with package name "mapred", not "mapreduce"), while my program is using the new jars as well?

I can try going back to the old word count example on the Apache site and using the old jars.

Any other pointers would be highly appreciated. Thanks.

On Thu, Nov 29, 2012 at 2:42 PM, Harsh J <[email protected]> wrote:

I think you may not have recompiled your application properly.

Your runtime shows this:

12/11/29 10:20:59 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

This should not appear, because your code has this (which I suspect you may have added later, accidentally?):

job.setJarByClass(WordCount2.class);

So if you delete the older jar and recompile, the problem should go away.

Also, generic options such as -libjars need to go first in order. It should always be [Classname] [Generic Options] [Application Options]. Otherwise, they may not get utilized properly.

On Fri, Nov 30, 2012 at 12:51 AM, Sandeep Jangra <[email protected]> wrote:

Yups, I can see my class files there.

On Thu, Nov 29, 2012 at 2:13 PM, Kartashov, Andy <[email protected]> wrote:

Can you try running "jar -tvf word_cnt.jar" and see if your static nested classes WordCount2$Map.class and WordCount2$Reduce.class have actually been added to the jar.

Rgds,
AK47

From: Sandeep Jangra [mailto:[email protected]]
Sent: Thursday, November 29, 2012 1:36 PM
To: [email protected]
Subject: Re: Trouble with Word Count example

Also, I did set the HADOOP_CLASSPATH variable to point to the word_cnt.jar only.

On Thu, Nov 29, 2012 at 10:54 AM, Sandeep Jangra <[email protected]> wrote:

Thanks for the quick response, Mahesh.

I am using the following command:

sudo -u hdfs hadoop jar word_cnt.jar WordCount2 /tmp/root/input /tmp/root/output15 -libjars=word_cnt.jar

(The input directory exists on HDFS.)

This is how I compiled and packaged it:

javac -classpath /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar:/usr/lib/hadoop/* -d word_cnt WordCount2.java
jar -cvf word_cnt.jar -C word_cnt/ .

On Thu, Nov 29, 2012 at 10:46 AM, Mahesh Balija <[email protected]> wrote:

Hi Sandeep,

To me everything seems to be all right. Can you tell us how you are running this job?

Best,
Mahesh.B.
Calsoft Labs.

On Thu, Nov 29, 2012 at 9:01 PM, Sandeep Jangra <[email protected]> wrote:

Hello everyone,

Like most others, I am also running into some problems while running my word count example.
I tried the various suggestions available on the internet, but I guess it's time to go on email :)

Here is the error that I am getting:

12/11/29 10:20:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/11/29 10:20:59 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
12/11/29 10:20:59 INFO input.FileInputFormat: Total input paths to process : 1
12/11/29 10:20:59 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/11/29 10:20:59 WARN snappy.LoadSnappy: Snappy native library is available
12/11/29 10:20:59 INFO snappy.LoadSnappy: Snappy native library loaded
12/11/29 10:21:00 INFO mapred.JobClient: Running job: job_201210310210_0040
12/11/29 10:21:01 INFO mapred.JobClient: map 0% reduce 0%
12/11/29 10:21:07 INFO mapred.JobClient: Task Id : attempt_201210310210_0040_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class WordCount2$Map not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1439)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:191)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:605)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.ClassNotFoundException: Class WordCount2$Map not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1350)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1437)
    ... 8 more

And here is the source code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import java.io.IOException;
import java.util.StringTokenizer;

public class WordCount2 extends Configured implements Tool {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        for (java.util.Map.Entry<String, String> entry : conf) {
            System.out.printf("%s=%s\n", entry.getKey(), entry.getValue());
        }

        System.out.println("arg[0]= " + args[0] + " args[1]= " + args[1]);

        Job job = new Job(conf, WordCount2.class.getSimpleName());
        job.setJobName("wordcount2");
        job.setJarByClass(WordCount2.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WordCount2(), args);
        System.exit(exitCode);
    }
}

NOTICE: This e-mail message and any attachments are confidential, subject to copyright and may be privileged. Any unauthorized use, copying or disclosure is prohibited. If you are not the intended recipient, please delete and contact the sender immediately. Please consider the environment before printing this e-mail.

--
Harsh J
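To summarize the ordering rule discussed in the thread, here is a sketch of the corrected invocation (jar name and HDFS paths are taken from the thread; as Mahesh notes, -libjars is redundant when the classes are already inside word_cnt.jar, and the real fix turned out to be file permissions, so this is shown only to illustrate where generic options belong):

```shell
# Generic options are parsed by GenericOptionsParser and must appear
# between the main class and the application arguments:
#   hadoop jar <app.jar> <MainClass> [generic options] <application args>
# Note the space-separated (not "=") form of -libjars.
sudo -u hdfs hadoop jar word_cnt.jar WordCount2 \
    -libjars word_cnt.jar \
    /tmp/root/input /tmp/root/output17
```

Placing -libjars after the application arguments, as in the original attempts, means the option is passed through to the program untouched instead of being handled by the framework.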
