Re: DecreasingComparator

Peter W. Mon, 02 Jul 2007 14:05:24 -0700

Devaraj,

You are correct that I wanted to order by url count.


The partition file output of a Hadoop task seems to be a dump
of the key/value pairs, so generally I'm interested in expressing
Collections.sort(value) in either the map or reduce method.
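
To make the Collections.sort idea concrete, here is a minimal sketch in
plain Java, not the Hadoop API. The class and method names (SortByCount,
sortDescending) are made up for illustration, and it assumes all the
counts fit in memory in one place, for example inside a single reduce
task.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: sorts url counts in memory.
class SortByCount {

    // Returns the (url, count) entries ordered by count, highest first.
    // Collections.sort is stable, so equal counts keep their input order.
    public static List<Map.Entry<String, Long>> sortDescending(Map<String, Long> counts) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        Collections.sort(entries, Map.Entry.<String, Long>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        // Counts taken from the mapreduce output quoted in this thread.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("urla.com", 3L);
        counts.put("urlb.com", 2L);
        counts.put("urlc.com", 4L);
        counts.put("urld.com", 1L);

        for (Map.Entry<String, Long> e : sortDescending(counts)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

With the counts from this thread, the loop prints urlc.com first and
urld.com last. A sort like this only orders what one task sees, so it
does not by itself give a global ordering across several reducers.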

Regards,

Peter W.



On Jul 1, 2007, at 9:39 PM, Devaraj Das wrote:

Your first MapReduce phase is very similar to the WordCount example. The
only difference is that you need to create LongWritable objects for the
values. The output format should be SequenceFileOutputFormat.class.

Run a subsequent MapReduce phase with the input format set to
SequenceFileInputFormat.class, the map class set to InverseMapping.class,
and the OutputKeyComparator set to LongWritable.DecreasingComparator.class.
By the way, the 2nd mapreduce phase won't work unless you patch your version
of hadoop with
https://issues.apache.org/jira/secure/attachment/12360717/1535_01.patch .
This hasn't been committed yet.
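
The data flow of the two phases can be sketched in plain Java, without
the Hadoop API. This is only an in-memory simulation under stated
assumptions: the names TwoPhaseSketch, countUrls and
invertAndSortDescending are hypothetical, the swap step stands in for
the map class named above, and a TreeMap with a reversed comparator
stands in for the decreasing key comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory simulation of the two phases; not Hadoop code.
class TwoPhaseSketch {

    // Phase 1: WordCount-style tally. Every url on a comma-separated
    // line contributes 1 to that url's count.
    public static Map<String, Long> countUrls(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            for (String url : line.split(",")) {
                String trimmed = url.trim();
                if (!trimmed.isEmpty()) {
                    counts.merge(trimmed, 1L, Long::sum);
                }
            }
        }
        return counts;
    }

    // Phase 2: swap (url, count) into (count, url) and keep the keys in
    // decreasing order. Urls sharing a count are grouped under one key,
    // much as a reduce call receives all values for a key.
    public static SortedMap<Long, List<String>> invertAndSortDescending(Map<String, Long> counts) {
        SortedMap<Long, List<String>> byCount = new TreeMap<>(Comparator.<Long>reverseOrder());
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            byCount.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return byCount;
    }

    public static void main(String[] args) {
        // Input lines from the original question in this thread.
        List<String> lines = Arrays.asList(
                "urla.com,urlb.com",
                "urla.com,urlc.com",
                "urlb.com,urlc.com",
                "urlc.com,urla.com",
                "urld.com,urlc.com");

        SortedMap<Long, List<String>> byCount = invertAndSortDescending(countUrls(lines));
        for (Map.Entry<Long, List<String>> e : byCount.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

The point of the swap is that the framework sorts on keys, not values,
so the count has to become the key before a decreasing comparator can
order the output.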

-----Original Message-----
From: Peter W. [mailto:[EMAIL PROTECTED]
Sent: Monday, July 02, 2007 6:08 AM
To: [email protected]
Subject: DecreasingComparator

Hello,
I have a modified WordCount program with the following characteristics:
input file:
urla.com,urlb.com
urla.com,urlc.com
urlb.com,urlc.com
urlc.com,urla.com
urld.com,urlc.com

mapreduce output:
urla.com 3
urlb.com 2
urlc.com 4
urld.com 1

Next, I tried using a comparator with a different JobConf and mapreduce:
jc.setOutputKeyComparatorClass(LongWritable.DecreasingComparator.class);
but it didn't work because the values are IntWritable and my OutputCollector
wasn't picking up the right things...
What do I need to collect in both the map and reduce for the final result to
sort descending high-low?

Thanks,

Peter W.
