Hi,

Is there any way to chain reducers? That is, the reducers first work on some data, and their output is then sent back to the same reducers, and so on, similar to how the conquer step works in divide-and-conquer algorithms. I hope that makes clear what I am asking. The problem I am actually trying to solve is not sorting, but something that can be solved with a divide-and-conquer algorithm.

Best Regards from Buffalo

Abhishek Agrawal
SUNY- Buffalo
(716-435-7122)

On Sun 02/28/10 3:24 PM, Ed Mazur [email protected] sent:
> Hi Abhishek,
>
> If you use input lines as your output keys in map, Hadoop internals
> will do the work for you and the keys will appear in sorted order in
> your reduce (you can use IdentityReducer). This needs a slight
> adjustment if your input lines aren't unique.
>
> If you have R reducers, this will create R sorted files. If you want a
> single sorted file, you can merge the R files or use 1 reducer.
> Another way is to use TotalOrderPartitioner which will ensure all keys
> in reduce N come after all keys in reduce N-1.
>
> Owen O'Malley and Arun C. Murthy's paper [1] about using Hadoop to win
> a sorting competition might be of interest to you.
>
> Ed
>
> [1] http://sortbenchmark.org/Yahoo2009.pdf
>
> On Sun, Feb 28, 2010 at 1:53 PM, <aa...@buffalo.edu> wrote:
> > Hello,
> >
> > I am trying to write a simple sorting application for hadoop. This is what
> > I have thought till now. Suppose I have 100 lines of data and 10 mappers, each of
> > the 10 mappers will sort the data given to it. But I am unable to figure out is
> > how to join these outputs to one big sorted array. In other words what should be
> > the code to be written in the reduce ?
> >
> > Best Regards from Buffalo
> >
> > Abhishek Agrawal
> >
> > SUNY- Buffalo
> > (716-435-7122)
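
For readers following the thread, here is a rough sketch of the two options Ed describes, written as a plain Python simulation rather than real Hadoop code. It assumes nothing about the Hadoop API: `run_job`, `hash_partition`, `range_partition`, and the hand-picked `split_points` are all hypothetical names for illustration (TotalOrderPartitioner itself is normally configured with sampled split points, which this sketch does not attempt to reproduce). Each simulated reducer simply sorts the keys it receives, standing in for the sorted order that the framework delivers to reduce.

```python
import heapq
from bisect import bisect_right


def hash_partition(key, num_reducers):
    # Hypothetical stand-in for a hash-style partitioner: keys are spread
    # across reducers with no regard to their order.
    return sum(key.encode("utf-8")) % num_reducers


def range_partition(key, split_points):
    # Hypothetical stand-in for a total-order partitioner: every key sent to
    # reducer N sorts after every key sent to reducer N-1.
    return bisect_right(split_points, key)


def run_job(lines, num_reducers, partition):
    # "Map": use each input line as the output key.
    # "Shuffle": route each key to one reducer via the partition function.
    buckets = [[] for _ in range(num_reducers)]
    for line in lines:
        buckets[partition(line)].append(line)
    # "Reduce": an identity reducer sees its keys in sorted order, so each
    # reducer writes one sorted output "file".
    return [sorted(bucket) for bucket in buckets]


lines = ["pear", "apple", "fig", "kiwi", "banana", "cherry", "mango", "grape"]

# Option 1: R sorted files that still need a merge step.
hashed = run_job(lines, 3, lambda key: hash_partition(key, 3))
merged = list(heapq.merge(*hashed))

# Option 2: range partitioning, so the R files can simply be concatenated.
split_points = ["d", "k"]  # assumed, hand-picked boundaries for 3 reducers
ranged = run_job(lines, 3, lambda key: range_partition(key, split_points))
concatenated = [key for part in ranged for key in part]
```

In this toy version duplicate lines survive because the buckets are plain lists; in a real job, identical keys are grouped into one reduce call, which is presumably the "slight adjustment" Ed mentions for non-unique input.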
