This tutorial should help: http://hadoop.apache.org/mapreduce/docs/r0.21.0/mapred_tutorial.html
Tom

On Thu, Sep 23, 2010 at 1:24 AM, Martin Becker <[email protected]> wrote:
> Hi,
> I would still like to use the new API. So what I am trying to do now is to
> not use the command line interface to submit a job, but to do it from Java
> code. How do I do this? This is what I do at the moment:
> * Clean start up of Hadoop (formatted file system and all)
> * Using the standard WordCount Mapper and Reducer, I wrote this main method:
>
> public static void main(String[] args) throws IOException,
>         InterruptedException, ClassNotFoundException {
>
>     Configuration configuration = new Configuration();
>     InetSocketAddress socket = new InetSocketAddress("localhost", 9001);
>     Cluster cluster = new Cluster(socket, configuration);
>
>     FileSystem fs = cluster.getFileSystem();
>     Path homeDirectory = fs.getHomeDirectory();
>
>     Path input = new Path(homeDirectory, INPUT);
>     Path output = new Path(homeDirectory, OUTPUT);
>
>     fs.delete(output, true);
>     fs.copyFromLocalFile(
>             new Path("resources/test/wordcount/data/ipsum.txt"),
>             new Path(input, "input.txt"));
>
>     Job job = Job.getInstance(cluster);
>
>     //1 job.addArchiveToClassPath(new Path("release/test.jar"));
>
>     //2 job.addFileToClassPath(new
>     //     Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount.class"));
>     // job.addFileToClassPath(new
>     //     Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Map.class"));
>     // job.addFileToClassPath(new
>     //     Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Reduce.class"));
>
>     job.setJarByClass(WordCount.class);
>     job.setMapperClass(Map.class);
>     job.setCombinerClass(Reduce.class);
>     job.setReducerClass(Reduce.class);
>     job.setOutputKeyClass(Text.class);
>     job.setOutputValueClass(IntWritable.class);
>     FileInputFormat.addInputPath(job, input);
>     FileOutputFormat.setOutputPath(job, output);
>
>     System.exit(job.waitForCompletion(true) ? 0 : 1);
> }
>
> * I tried to run this code as is in Eclipse.
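[Editor's note: a quick way to rule out packaging problems behind the ClassNotFoundException discussed below is to check, before submitting, that the job jar actually contains the classes Hadoop will look for. This is a hypothetical helper, not part of the thread; it uses only the plain JDK (no Hadoop dependency), and the jar path and class names are the ones from Martin's messages.]

```java
import java.io.File;
import java.util.jar.JarFile;

public class JarCheck {

    // Returns true if the given class (including inner classes such as
    // "...WordCount$Map") is packaged inside the jar that will be
    // shipped with the job.
    public static boolean containsClass(File jar, String className) throws Exception {
        // JVM naming: package dots become '/', the inner-class '$' stays as-is
        String entry = className.replace('.', '/') + ".class";
        try (JarFile jf = new JarFile(jar)) {
            return jf.getJarEntry(entry) != null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Jar path from the thread; pass a different one as the first argument
        File jar = new File(args.length > 0 ? args[0] : "release/test.jar");
        if (!jar.isFile()) {
            System.err.println("No such jar: " + jar);
            return;
        }
        for (String cls : new String[] {
                "de.fstyle.hadoop.tutorial.wordcount.WordCount",
                "de.fstyle.hadoop.tutorial.wordcount.WordCount$Map",
                "de.fstyle.hadoop.tutorial.wordcount.WordCount$Reduce" }) {
            System.out.println(cls + " -> " + containsClass(jar, cls));
        }
    }
}
```

If `WordCount$Map` is missing here, the jar was built from the wrong directory (e.g. not from the root of `bin/`), and no amount of classpath tweaking in the driver will fix it.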
> * Obviously, I guess, Hadoop needed the WordCount classes to work, so I got
> this error:
> java.lang.RuntimeException: java.lang.ClassNotFoundException:
> de.fstyle.hadoop.tutorial.wordcount.WordCount$Map
> * Putting everything into a jar and adding the following line did not do any
> good:
> job.addArchiveToClassPath(new Path("release/test.jar"));
> * Adding each class separately throws the same exception:
> job.addFileToClassPath(new
>     Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount.class"));
> job.addFileToClassPath(new
>     Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Map.class"));
> job.addFileToClassPath(new
>     Path("bin/de/fstyle/hadoop/tutorial/wordcount/WordCount$Reduce.class"));
> * Using
> job.setJar("release/test.jar");
> will get me
> java.io.FileNotFoundException: File
> /tmp/hadoop-martin/mapred/staging/martin/.staging/job_201009221802_0033/job.jar
> does not exist.
>
> So how would I set this up/use it correctly? Sorry, I did not find any
> tutorial or examples anywhere.
>
> Martin
>
>
> On 22.09.2010 18:29, Tom White wrote:
>> Note that JobClient, along with the rest of the "old" API in
>> org.apache.hadoop.mapred, has been undeprecated in Hadoop 0.21.0, so
>> you can continue to use it without warnings.
>>
>> Tom
>>
>> On Wed, Sep 22, 2010 at 2:43 AM, Amareshwari Sri Ramadasu
>> <[email protected]> wrote:
>>> In 0.21, JobClient methods are available in the
>>> org.apache.hadoop.mapreduce.Job and org.apache.hadoop.mapreduce.Cluster
>>> classes.
>>>
>>> On 9/22/10 3:07 PM, "Martin Becker" <[email protected]> wrote:
>>>> Hello,
>>>>
>>>> I am using Hadoop MapReduce version 0.20.2 and soon 0.21.
>>>> I wanted to use the JobClient class to circumvent the command
>>>> line interface.
>>>> I noticed that JobClient still uses the deprecated JobConf class for
>>>> job submissions.
>>>> Are there any alternatives to JobClient that do not use the deprecated
>>>> JobConf class?
>>>>
>>>> Thanks in advance,
>>>> Martin
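[Editor's note: for the archive, a new-API driver along the lines discussed in this thread looks roughly like the sketch below. It is a hedged outline against the 0.21 org.apache.hadoop.mapreduce classes, not a verified implementation: it reuses the thread's `Job.getInstance(cluster)` and `job.setJar(...)` calls and Martin's `WordCount` inner classes, and it needs the Hadoop jars on the classpath plus a running cluster, so it is not runnable standalone. The key point from the thread is that the jar named in `setJar` must exist relative to the working directory and contain the mapper/reducer classes.]

```java
import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Cluster + Job replace the deprecated JobClient/JobConf pair
        Cluster cluster = new Cluster(new InetSocketAddress("localhost", 9001), conf);
        Job job = Job.getInstance(cluster);

        // Ship a jar that actually contains the job classes.
        // setJarByClass only helps when the driver itself runs *from* that
        // jar (e.g. via "hadoop jar"); from an IDE, point at it explicitly:
        job.setJar("release/test.jar"); // path from the thread

        job.setMapperClass(WordCount.Map.class);       // thread's classes
        job.setCombinerClass(WordCount.Reduce.class);
        job.setReducerClass(WordCount.Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```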
