Current Demux if i remember well is using reducerType for the
partitionning.
If you don't need a global sort per datatype then you can write your
own partionner based on the key instead of the dataType.
/Jerome
On Mar 28, 2010, at 6:53, "Oded Rosen" <o...@legolas-media.com> wrote:
Hey everyone,
Thanks to your help (especially by Eric & Jerome), I've managed to
write my own little demux processor, including a customized mapper &
reducer, for my data type.
For now, all of my map output is sent to only reduce process
(although Chukwa opens 8 different reduce processes in each demux
run).
I would like to exploit the whole cluster, and to have multiple
reduce processes (same reducer class, of course, just many instances
of them).
I've tried to do it by setting different values to
ChukwaRecordKey.setKey() in my mapper:
protected void parse(String recordEntry,
OutputCollector<ChukwaRecordKey, ChukwaRecord> output,
Reporter reporter) throws Throwable {
key = new ChukwaRecordKey();
String keyStr = DATA_TYPE + Math.floor((NUM_OF_REDUCERS*Math.random
()))+1;
ChukwaRecord record = new ChukwaRecord();
this.buildGenericRecord(record, null, timestamp, keyStr);
key.setKey(keyStr);
key.setReduceType(ReducerName);
.... (record logic)....
output.collect(key, record);
}
Although I have multiple keys, all of the records are still sent to
the same reducer process.
How can I send records to different processes?
Thanks a lot,
--
Oded