Hi, I am using a Mac. I have not used Maven before, and I am new to Hadoop and Eclipse.
Any directions to start a project as map reduce, as per all the videos on YouTube?

Thanks,
Bharati

On May 22, 2013, at 4:23 PM, Sanjay Subramanian <[email protected]> wrote:
> Hi
>
> I don't need any special plugin to walk through the code.
>
> All my map reduce jobs have a
>
> JobMapper.java
> JobReducer.java
> JobProcessor.java (set any configs u like)
>
> I create a new maven project in eclipse (easier to manage dependencies)…the
> elements are in the order as they should appear in the POM.
>
> Then in Eclipse Debug Configurations I create a new Java application, and then
> I start debugging! That’s it.
>
> MAVEN REPO INFO
> ================
> <repositories>
>   <repository>
>     <id>Cloudera repository</id>
>     <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
>   </repository>
> </repositories>
>
> <properties>
>   <cloudera_version>2.0.0-cdh4.1.2</cloudera_version>
> </properties>
>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-mapreduce-client-core</artifactId>
>   <version>${cloudera_version}</version>
>   <scope>compile</scope>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-common</artifactId>
>   <version>${cloudera_version}</version>
>   <scope>compile</scope>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-client</artifactId>
>   <version>${cloudera_version}</version>
>   <scope>compile</scope>
> </dependency>
>
> WordCountNew (please modify as needed)
> ======================================
>
> public class WordCountNew {
>
>   public static class Map extends
>       org.apache.hadoop.mapreduce.Mapper<LongWritable, Text, Text, IntWritable> {
>
>     private final static IntWritable one = new IntWritable(1);
>     private Text word = new Text();
>
>     public void
>         map(LongWritable key, Text value, Context ctxt)
>         throws IOException, InterruptedException {
>       FileSplit fileSplit = (FileSplit) ctxt.getInputSplit();
>       // System.out.println(value.toString());
>       // available if the input file name is needed; currently unused
>       String fileName = fileSplit.getPath().toString();
>       String line = value.toString();
>       StringTokenizer tokenizer = new StringTokenizer(line);
>       while (tokenizer.hasMoreTokens()) {
>         word.set(tokenizer.nextToken());
>         ctxt.write(word, one);
>       }
>     }
>   }
>
>   public static class Reduce extends
>       org.apache.hadoop.mapreduce.Reducer<Text, IntWritable, Text, IntWritable> {
>
>     public void reduce(Text key, Iterable<IntWritable> values, Context ctxt)
>         throws IOException, InterruptedException {
>       int sum = 0;
>       for (IntWritable value : values) {
>         sum += value.get();
>       }
>       ctxt.write(key, new IntWritable(sum));
>     }
>   }
>
>   public static void main(String[] args) throws Exception {
>     org.apache.hadoop.conf.Configuration hadoopConf =
>         new org.apache.hadoop.conf.Configuration();
>     hadoopConf.set(MapredConfEnum.IMPRESSIONS_LOG_REC_SEPARATOR.getVal(),
>         MapredConfEnum.PRODUCT_IMPR_LOG_REC_END.getVal());
>     hadoopConf.set(MapredConfEnum.IMPRESSIONS_LOG_REC_CACHED_SEPARATOR.getVal(),
>         MapredConfEnum.PRODUCT_IMPR_LOG_REC_CACHED.getVal());
>     hadoopConf.set("io.compression.codecs",
>         "org.apache.hadoop.io.compress.GzipCodec");
>
>     Job job = new Job(hadoopConf);
>     job.setJobName("wordcountNEW");
>     job.setJarByClass(WordCountNew.class);
>     job.setOutputKeyClass(Text.class);
>     job.setOutputValueClass(IntWritable.class);
>     job.setMapOutputKeyClass(Text.class);
>     job.setMapOutputValueClass(IntWritable.class);
>
>     job.setMapperClass(WordCountNew.Map.class);
>     job.setCombinerClass(WordCountNew.Reduce.class);
>     job.setReducerClass(Reduce.class);
>     // job.setInputFormatClass(ZipMultipleLineRecordInputFormat.class);
>     job.setInputFormatClass(org.apache.hadoop.mapreduce.lib.input.TextInputFormat.class);
>     job.setOutputFormatClass(TextOutputFormat.class);
>
>     if (FileUtils.doesFileOrDirectoryExist(args[1])) {
>       org.apache.commons.io.FileUtils.deleteDirectory(new File(args[1]));
>     }
>
>     org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(job,
>         new Path(args[0]));
>     org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(job,
>         new Path(args[1]));
>
>     job.waitForCompletion(true);
>   }
> }
>
>
> From: Bharati <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Wednesday, May 22, 2013 3:39 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: Eclipse plugin
>
> Hi Jing,
>
> I want to be able to open a project as a map reduce project in Eclipse instead
> of a Java project, as per some of the videos on YouTube.
>
> For now, let us say I want to write a wordcount program and step through it
> with hadoop 1.2.0. How can I use Eclipse to write the code?
>
> The goal here is to set up the development environment to start a project as
> map reduce right in Eclipse or NetBeans, whichever works better. The idea is
> to be able to step through the code.
>
> Thanks,
> Bharati
>
> Sent from my iPad
>
> On May 22, 2013, at 2:42 PM, Jing Zhao <[email protected]> wrote:
>
> > Hi Bharati,
> >
> > Usually you only need to run "ant clean jar jar-test" and "ant
> > eclipse" on your code base, and then import the project into your
> > eclipse. Can you provide some more detailed description about the
> > problem you met?
> >
> > Thanks,
> > -Jing
> >
> > On Wed, May 22, 2013 at 2:25 PM, Bharati <[email protected]>
> > wrote:
> >> Hi,
> >>
> >> I am trying to get or build the Eclipse plugin for 1.2.0.
> >>
> >> All the methods I found on the web did not work for me. Any tutorial or
> >> method to build the plugin would help.
> >>
> >> I need to build a hadoop map reduce project and be able to debug it in
> >> eclipse.
> >>
> >> Thanks,
> >> Bharati
> >> Sent from my iPad
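For anyone following the thread: before wiring the code into a Hadoop Job, it can help to step through the same tokenize-and-sum logic in plain Java, since that requires no cluster and no Hadoop jars on the classpath. Below is a minimal sketch assuming only the JDK; the class and method names (WordCountLogic, countWords) are made up for illustration, not from the messages above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java version of the map step (tokenize, emit 1 per word) and the
// reduce step (sum the 1s per word), easy to breakpoint in Eclipse.
public class WordCountLogic {

    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            String word = tokenizer.nextToken();
            // "reduce" step: accumulate the count for this word
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("to be or not to be"));
        // prints {to=2, be=2, or=1, not=1}
    }
}
```

Once this behaves as expected under the debugger, the same loop body maps directly onto the Mapper's map() and the Reducer's reduce() in the WordCountNew example above.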
