date:20120308

develop and deploy MapReduce

2012-03-08 Thread Viney Gupta

Hi all, I am new to Hadoop and have just started coding in MapReduce. I've checked out the trunk and am able to build the MapReduce project. I have also imported the code into Eclipse. My very first goal is to add a few printout statements, locally build the MR jar, deploy it to a testbed, and run a test prog

Re: develop and deploy MapReduce

2012-03-08 Thread AnilKumar B

Hi Viney, Instead of adding sysouts and rebuilding every time, I suggest you set up a dev environment for debugging. As you have already built the MapReduce project, you can add debug confs in yarn-env.sh, put breakpoints in the code, and start analyzing it. 1) Copy yarn-env.sh( hadoop-dist/tar

reduce stop after n records

2012-03-08 Thread Henry Helgen

I am using hadoop 0.20.2 mapreduce API. The program is running fine, just slower than it could be. I sum values and then use job.setSortComparatorClass(LongWritable.DecreasingComparator.class) to sort descending by sum. I need to stop the reducer after outputting the first N records. This would save
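
One possible way to get the early stop described above, sketched in plain Java so it runs without a Hadoop install. The class TopNCutoff, the Record type, and the method firstN are hypothetical stand-ins, not Hadoop classes. In the org.apache.hadoop.mapreduce API the same idea would probably take the form of overriding Reducer.run(Context) and leaving its context.nextKey() loop once N records have been written; that mapping is an assumption about how the poster's job is structured, and it only yields a global top N if the job runs with a single reduce task.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Plain-Java stand-in for the reducer loop; no Hadoop classes are used. */
final class TopNCutoff {

    /** Hypothetical stand-in for one (sum, item) record reaching the reducer. */
    static final class Record {
        final long sum;
        final String item;

        Record(long sum, String item) {
            this.sum = sum;
            this.item = item;
        }
    }

    /**
     * Emits at most n records from an input that is assumed to be sorted
     * descending by sum already. The size check comes first in the loop
     * condition, so once the limit is reached the iterator is not touched
     * again and the remaining input is never read.
     */
    static List<Record> firstN(Iterator<Record> sortedDescending, int n) {
        List<Record> out = new ArrayList<>();
        while (out.size() < n && sortedDescending.hasNext()) {
            out.add(sortedDescending.next());
        }
        return out;
    }
}
```

Calling TopNCutoff.firstN(records.iterator(), 10) on a descending-sorted list returns its first ten entries and leaves the rest of the iterator unconsumed, which is where the time saving would come from.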