Does that mean that when job.waitForCompletion(true) returns, I have the results from the Reducer(s) available to me? I haven't seen much on coprocessors; can you point me to some examples of their use?

Thanks

-Pete

-----Original Message-----
From: Jonathan Gray [mailto:[email protected]]
Sent: Friday, December 17, 2010 11:13 AM
To: [email protected]
Subject: RE: Results from a Map/Reduce

Hey Peter,

That System.exit line is nothing important, just the main thread waiting for the tasks to finish before closing.

You're interested in having the MR job return a single result? To do that, you would need to roll up the processing done in each of your Map tasks into a single Reduce task. With one reducer, you have a single point to do the final aggregation of the result.

I'm not sure exactly what kind of aggregation you are doing, but funneling into a single reducer can range from "no problem" to "don't even try it". Sounds like you just want a final number or something, so it shouldn't be an issue.

You might also consider doing your aggregations with coprocessors if you're into experimenting on HBase Trunk :)

As for FirstKeyOnlyFilter:

/**
 * A filter that will only return the first KV from each row.
 * <p>
 * This filter can be used to more efficiently perform row count operations.
 */

That's what it does. If you scan a table, regardless of what you ask for in the query, the filter will just return whatever the first KeyValue is on each row and will skip every other column/version/value of that row. Like it says, it's generally useful for doing row counting, but that's about it.

JG

> -----Original Message-----
> From: Peter Haidinyak [mailto:[email protected]]
> Sent: Friday, December 17, 2010 10:56 AM
> To: [email protected]
> Subject: Results from a Map/Reduce
>
> Hi, dumb question again.
> I have been using a Scan to return a result back to my client, which works
> fine except when I am returning a million rows just to aggregate the results.
> The next logical step would be to do the aggregation in a Map/Reduce. I've
> been looking at what samples I could find and they all seem to do this...
>
> System.exit(job.waitForCompletion(true) ? 0 : 1);
>
> My question: is there a way to return a result from the job, in a similar way
> to getting a ResultScanner back and iterating through the results?
>
> Also, is there a good definition of what a 'FirstKeyOnlyFilter' does?
>
> Thanks
>
> -Pete
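
For anyone reading the thread later, the single-reducer roll-up described above can be pictured with a small plain-Java sketch. This is not Hadoop or HBase code: the class and the methods `mapTask`, `reduceTask`, and `run` are hypothetical stand-ins, and a sum is assumed as the aggregation only because the thread mentions wanting "a final number". Each "map task" pre-aggregates its own slice of rows, and one "reduce task" combines the partial results into the single final value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; no Hadoop or HBase classes are involved.
class SingleReducerSketch {

    // Stand-in for one map task: pre-aggregates the values in its own slice.
    static long mapTask(List<Long> slice) {
        long partial = 0;
        for (long value : slice) {
            partial += value;
        }
        return partial;
    }

    // Stand-in for the single reduce task: the one place where all
    // partial results meet and the final aggregate is produced.
    static long reduceTask(List<Long> partials) {
        long total = 0;
        for (long partial : partials) {
            total += partial;
        }
        return total;
    }

    // Runs every "map task", then funnels the partials into one "reducer".
    static long run(List<List<Long>> slices) {
        List<Long> partials = new ArrayList<>();
        for (List<Long> slice : slices) {
            partials.add(mapTask(slice));
        }
        return reduceTask(partials);
    }

    public static void main(String[] args) {
        List<List<Long>> slices = Arrays.asList(
                Arrays.asList(1L, 2L, 3L),
                Arrays.asList(10L, 20L),
                Arrays.asList(100L));
        System.out.println(run(slices));
    }
}
```

The sketch also hints at why a single reducer "can range from no problem to don't even try it": the reducer only sees one partial per map task here, which is cheap, but an aggregation that cannot be pre-combined would push every row through that one point.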

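
The FirstKeyOnlyFilter behaviour quoted above can be mimicked in the same spirit. The sketch below is a toy model, not the HBase filter itself: a table is assumed to be a map from row key to an ordered list of cells (plain strings here), and `firstCellPerRow` and `countRows` are hypothetical helper names. It keeps only the first cell of each row and drops the rest, which is enough to count rows without carrying every column back.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of "return only the first KV from each row"; not the HBase API.
class FirstKeyOnlySketch {

    // Keeps the first cell of every row and skips all remaining cells.
    // Rows with no cells are skipped, since such a row would not exist in a scan.
    static Map<String, String> firstCellPerRow(Map<String, List<String>> table) {
        Map<String, String> firstCells = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> row : table.entrySet()) {
            List<String> cells = row.getValue();
            if (!cells.isEmpty()) {
                firstCells.put(row.getKey(), cells.get(0));
            }
        }
        return firstCells;
    }

    // Row counting only needs one cell per row, so the filtered view suffices.
    static int countRows(Map<String, List<String>> table) {
        return firstCellPerRow(table).size();
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("row1", Arrays.asList("cf:a=1", "cf:b=2"));
        table.put("row2", Arrays.asList("cf:a=3"));
        table.put("row3", Arrays.asList("cf:b=4", "cf:c=5", "cf:d=6"));
        System.out.println(countRows(table));
    }
}
```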