Hey all,

I'm writing a module that needs to run aggregate operations over the entire data set. However, I also want to use multiple reducers.
For example, let's say each row of input data looks like this:

DATE_TIME_TEMPERATURE

My mapper outputs DATE_TIME as the key and the temperature as the value. In this example I'm using 3 reducers, which should create three output files, each covering only part of the data. I want to find the day with the highest temperature in the entire data set.

I know I could just write a script that examines the output from the reducers and picks out the value with the highest temperature. I could also write a second MapReduce job that does the same thing, and chain the two jobs together. However, both solutions seem kind of wrong to me. What's the commonly accepted best way to do this?

--Jeremy
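
P.S. To make the two-step idea concrete, here is a rough sketch in plain Python (no Hadoop involved) of what I mean: each of the 3 reducers finds the maximum of its own partition, and a final pass picks the largest of those. The function names, the sample rows, and the CRC32-based partitioning are all made up for illustration, and I'm assuming the date and time fields never contain an underscore themselves.

```python
import zlib

NUM_REDUCERS = 3  # matches the three reducers in the example above


def map_row(row):
    """Split 'DATE_TIME_TEMPERATURE' into a ('DATE_TIME', temperature) pair."""
    # Assumption: fields are separated by underscores and the temperature is last.
    date, time, temp = row.rsplit("_", 2)
    return date + "_" + time, float(temp)


def partition(key, num_reducers=NUM_REDUCERS):
    """Pick a reducer for a key (a stand-in for a hash partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_local_max(pairs):
    """What one reducer would emit: the hottest pair it saw, or None if it saw nothing."""
    return max(pairs, key=lambda kv: kv[1], default=None)


def hottest_reading(rows, num_reducers=NUM_REDUCERS):
    """Simulate map -> partition -> per-reducer max -> final merge."""
    buckets = [[] for _ in range(num_reducers)]
    for row in rows:
        key, temp = map_row(row)
        buckets[partition(key, num_reducers)].append((key, temp))

    # Step 1: each reducer only knows the maximum of its own partition.
    local_maxima = [m for m in map(reduce_local_max, buckets) if m is not None]

    # Step 2: the extra pass (a script or a chained job) over at most
    # num_reducers candidates yields the global maximum.
    return max(local_maxima, key=lambda kv: kv[1], default=None)


if __name__ == "__main__":
    # Made-up sample rows, purely for illustration.
    rows = [
        "2009-06-01_1200_71.5",
        "2009-06-02_1500_88.0",
        "2009-06-03_0900_64.2",
    ]
    key, temp = hottest_reading(rows)
    print(key.split("_")[0], temp)
```

The second step only ever looks at one candidate per reducer, which is why a whole chained job for it feels like overkill to me.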
