Hi Ari,

The Demux framework has been modified to operate in two modes. First, the map reduce mode is fully backward compatible with Chukwa 0.4 demux. Second, the Chukwa collector uses HBaseWriter, which implements its own OutputCollector and invokes the demux parsers. This makes it easy to write one parser that works in both modes.
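To illustrate the idea of one parser serving both modes, here is a rough, self-contained sketch. It does not use the real Chukwa or Hadoop classes: Collector, BufferingCollector, TableCollector, the parse signature, and the "timestamp-source" key format are all hypothetical stand-ins made up for this example. The only point is that the parser talks to a collector interface and does not care which implementation sits behind it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; these are NOT the Chukwa or Hadoop classes.
class TwoModeSketch {

    /** Minimal stand-in for an output collector: receives (key, record) pairs. */
    interface Collector {
        void collect(String key, Map<String, String> record);
    }

    /** Stand-in for the map reduce path: remembers the keys it was handed. */
    static class BufferingCollector implements Collector {
        final List<String> keys = new ArrayList<>();

        public void collect(String key, Map<String, String> record) {
            keys.add(key);
        }
    }

    /** Stand-in for the HBaseWriter path: stores each record under its row key. */
    static class TableCollector implements Collector {
        final Map<String, Map<String, String>> rows = new LinkedHashMap<>();

        public void collect(String key, Map<String, String> record) {
            rows.put(key, record);
        }
    }

    /** One parser, written once: turns "name=value" pairs into a single record. */
    static void parse(String source, long timestamp, String body, Collector out) {
        Map<String, String> record = new LinkedHashMap<>();
        for (String pair : body.trim().split("\\s+")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                record.put(kv[0], kv[1]);
            }
        }
        // Assumed key format for this sketch: timestamp, a dash, then the source.
        out.collect(timestamp + "-" + source, record);
    }

    public static void main(String[] args) {
        BufferingCollector mapReduceSide = new BufferingCollector();
        TableCollector tableSide = new TableCollector();

        // The same parse call feeds either collector unchanged.
        parse("host1", 1293501120000L, "cpu=0.42 mem=0.73", mapReduceSide);
        parse("host1", 1293501120000L, "cpu=0.42 mem=0.73", tableSide);

        System.out.println(mapReduceSide.keys);
        System.out.println(tableSide.rows);
    }
}
```

In the real framework the collector and record types come from Chukwa and Hadoop, so treat the names above purely as placeholders.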
Take a look at org.apache.hadoop.chukwa.extraction.demux.processor.mapper.SystemMetrics. All demux parsers extend the AbstractProcessor class and implement the parse function. The inputs to the parse function are basically the Chukwa chunk as a string, an output collector, and a reporter. There is also a special helper function:

buildGenericRecord(ChukwaRecord record, String body, long timestamp, String reduceType);

ChukwaRecord is basically a HashMap, and it is grouped by reduceType, timestamp, and primary key (i.e. csource). In the HBase mode, reduceType maps to the column family name, and timestamp + primary key is mapped to the row key in HBase. The table name is defined by an annotation at the beginning of the class. HBaseWriter's OutputCollector takes the output emitted by the parse function and puts the records into HBase.

In summary, to develop a demux processor:

1. Extend AbstractProcessor
2. Annotate the table name
3. Implement the parse function
4. Configure chukwa-demux-conf.xml to map the data type to the new parser
5. Create the HBase schema
6. Restart the collector with the new jar and watch the data flow in and show up in HICC

regards,
Eric

On Mon, Dec 27, 2010 at 6:12 PM, Ariel Rabkin <[email protected]> wrote:
> Howdy.
>
> I'm gearing up to make use of the new Demux framework. I have several
> site-specific metrics that I want to use Chukwa to collect and graph.
>
> I'm a little vague about how to do this. I think I see what the HBase
> metric creation needs to be. But what do I need to do in the way of
> Demux processors?
>
> What input format does HICC expect / what's the output format supposed
> to be? Which are the right examples for me to look at? Is anything
> documented yet? Who has done this already?
>
>
> -Ari
>
> --
> Ari Rabkin [email protected]
> UC Berkeley Computer Science Department
>
