Hi guys! Any thoughts on this? Should I have sent my queries to a different distribution list?
Thanks!
Pony

On Mon, May 23, 2011 at 5:36 PM, Juan P. <[email protected]> wrote:

> Hi guys,
> I wanted to get your help with a couple of questions which came up while
> looking at the Hadoop Comparator/Comparable architecture.
>
> As I see it before each reducer operates on each key, a sorting algorithm
> is applied to them. *Why does Hadoop need to do that?*
>
> If I implement my own class and I intend to use it as a Key I must allow
> for instances of my class to be compared. So I have 2 choices: I can
> implement WritableComparable or I can register a WritableComparator for my
> class. Should I fail to do either, would the Job fail?
> If I register my WritableComparator which does not use the Comparable
> interface at all, does my Key need to implement WritableComparable?
> If I don't implement my Comparator and my Key implements
> WritableComparable, does it mean that Hadoop will deserialize my Keys twice?
> (once for sorting, and once for reducing)
> What is RawComparable used for?
>
> Thanks for your help!
> Pony
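
To make the "deserialize twice" question concrete, here is a rough sketch of the two ways a sort could compare keys: by deserializing them first, or by comparing their serialized bytes directly. This is plain Java and does not use the Hadoop API; the class `RawCompareSketch`, its method names, and the key layout (a single non-negative int id written as 4 big-endian bytes) are all assumptions made up for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in, NOT the Hadoop API: a key that holds one non-negative int id.
class RawCompareSketch {

    // Serialize the key as 4 big-endian bytes (ByteBuffer's default byte order).
    static byte[] serialize(int id) {
        return ByteBuffer.allocate(4).putInt(id).array();
    }

    // Object-level path: decode both keys back into values, then compare them.
    static int compareDeserialized(byte[] a, byte[] b) {
        int left = ByteBuffer.wrap(a).getInt();
        int right = ByteBuffer.wrap(b).getInt();
        return Integer.compare(left, right);
    }

    // Raw path: compare the serialized bytes directly, without decoding the keys.
    // Assumes non-negative ids, so unsigned byte order matches numeric order.
    static int compareRaw(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] small = serialize(7);
        byte[] large = serialize(300);
        // Both comparisons agree that 7 sorts before 300.
        System.out.println(Integer.signum(compareDeserialized(small, large)));
        System.out.println(Integer.signum(compareRaw(small, large)));
    }
}
```

Both paths give the same ordering here, but only the first one has to rebuild the key values for every comparison. A byte-level comparator is only safe when the serialized layout preserves the intended order, which is why the sketch restricts itself to non-negative ids.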
