Hi all,

Has anyone tried extending PutSortReducer in order to add some traditional reduce logic (e.g., aggregating counters)?
I want to process data with a Hadoop MapReduce job (aggregating counters per key, a traditional Hadoop MR job), but bulk load the reduce output into HBase. As I understand it, the "native" way to do this is to run two jobs: the first aggregates counters by key, and the second creates Puts (in its map phase) and bulk loads them into HBase via HFileOutputFormat.configureIncrementalLoad().

I was thinking of combining the two into one MapReduce job: the map phase of the first job becomes the map phase of the combined job, and the reducer of the new job extends PutSortReducer, so that the reduce logic of the first job runs first and then PutSortReducer's reduce takes over to write the output as KeyValues.

Any thoughts? Has anyone tried something similar and has anything to add or correct?

Thanks,
Amit.
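To make the idea concrete, here is a rough, untested sketch of what I mean. One wrinkle I noticed: PutSortReducer expects Put values from the mapper, so if the mapper emits plain counts (say, LongWritable), the reducer can't literally extend PutSortReducer with those input types; instead it can aggregate and emit KeyValues itself, which is what PutSortReducer ultimately does. The column family "cf" and qualifier "count" below are placeholders, and the mapper is assumed to emit (rowkey, 1):

```java
import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch only: a single reducer that does the "traditional" aggregation
// and emits KeyValues directly for HFileOutputFormat, replacing the
// two-job chain. Assumes the mapper emits (ImmutableBytesWritable rowkey,
// LongWritable 1). "cf"/"count" are made-up names for illustration.
public class AggregatingKeyValueReducer
    extends Reducer<ImmutableBytesWritable, LongWritable,
                    ImmutableBytesWritable, KeyValue> {

  @Override
  protected void reduce(ImmutableBytesWritable row,
                        Iterable<LongWritable> counts,
                        Context context)
      throws IOException, InterruptedException {
    // Reduce logic of the "first" job: sum the counters for this key.
    long sum = 0;
    for (LongWritable c : counts) {
      sum += c.get();
    }
    // Emit one KeyValue for the aggregated counter. With a single cell
    // per row there is nothing to sort within the row, so PutSortReducer's
    // in-row sorting step isn't strictly needed here.
    KeyValue kv = new KeyValue(row.get(), Bytes.toBytes("cf"),
        Bytes.toBytes("count"), Bytes.toBytes(sum));
    context.write(row, kv);
  }
}
```

In the driver I'd still call HFileOutputFormat.configureIncrementalLoad() to get the TotalOrderPartitioner setup, then override the reducer class with the one above (I haven't verified that overriding it after configureIncrementalLoad() is safe, so corrections welcome).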
