Re: Writing another data process

Eric Yang Sun, 21 Mar 2010 16:33:37 -0700

I believe the right reducer class name is
org.apache.hadoop.chukwa.extraction.demux.processor.reducer.ReduceProcessor.
The problem is that setReduceType expects the suffix of the class name
(e.g. SystemMetrics), not the fully qualified name.  Hence you should change

key.setReduceType(RawReducer.class.getName());

to

key.setReduceType(RawReducer.class.getSimpleName());

Hope this helps.
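
To illustrate the difference between the two calls, here is a minimal sketch in plain Java. It does not use any Chukwa classes: java.util.ArrayList stands in for RawReducer, and ReduceTypeDemo and reduceTypeFor are hypothetical names chosen only for this example.

```java
import java.util.ArrayList;

public class ReduceTypeDemo {

    // Hypothetical helper: returns the class name without its package,
    // which is the form the reply above says setReduceType expects.
    static String reduceTypeFor(Class<?> cls) {
        return cls.getSimpleName();
    }

    public static void main(String[] args) {
        // Stand-in for RawReducer, which is not available outside the poster's jar.
        Class<?> standIn = ArrayList.class;

        // getName() includes the package prefix.
        System.out.println(standIn.getName());      // prints "java.util.ArrayList"

        // getSimpleName() drops the package prefix.
        System.out.println(reduceTypeFor(standIn)); // prints "ArrayList"
    }
}
```

With a real reducer the same pattern would presumably read key.setReduceType(RawReducer.class.getSimpleName()), as suggested above.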


Regards,
Eric

On 3/20/10 2:36 PM, "Oded Rosen" <o...@legolas-media.com> wrote:

> Hi,
> I have a strange error with a chukwa parser that I wrote.
> 
> The reducer class implements
> org.apache.hadoop.chukwa.extraction.demux.processor.ReduceProcessor.
> I'm setting (on the map class):
> key.setReduceType(RawReducer.class.getName());
> 
> and on the reducer I have:
> @Override
>     public String getDataType() {
>         return this.getClass().getName();
>     }
> 
> The demux conf redirects my data type to my mapper class.
> 
> I've put these classes in a jar in the hdfs:/../chukwa/demux folder, but the
> reducer will not execute.
> I get a map/reduce job with a map input of a few million bytes, but map output
> bytes always equals 0.
> No output is written to the repos or any other dir on hdfs/../chukwa.
> I guess the output is empty because the demux cannot find my reducer.
> I've tried to put these classes in the chukwa-core jar, with the same results.
> 
> I've already successfully written a mapper-only solution, but I need the
> reducer this time.
> What am I doing wrong?
> 
> Thanks in advance,
> 
> On Wed, Mar 10, 2010 at 6:28 AM, Eric Yang <ey...@yahoo-inc.com> wrote:
>> Hi Oded,
>> 
>> Chukwa 0.3 does not support external class files.  On TRUNK, you can
>> create your own parser to run in demux.  The parser class should extend
>> org.apache.hadoop.chukwa.extraction.demux.processor.AbstractProcessor for a
>> mapper, or implement
>> org.apache.hadoop.chukwa.extraction.demux.processor.ReduceProcessor for a
>> reducer.  Edit CHUKWA_CONF/chukwa-demux-conf.xml, and point the
>> RecordType at your class names.
>> 
>> Once you have both the class files and the chukwa-demux-conf.xml file, put
>> your jar file in hdfs://namenode:port/chukwa/demux and the next demux job
>> will pick up the parsers and run them automatically.  Duplicate detection
>> should be handled by your mapper or reducer class, or by a post-demux step.
>> Chukwa does not currently offer duplicate detection.  Hope this helps.
>> 
>> Regards,
>> Eric
>> 
>> 
>> 
>> On 3/9/10 1:01 PM, "Oded Rosen" <o...@legolas-media.com> wrote:
>> 
>>> Hi,
>>> 
>>> I wonder if one can write an additional data process (in addition to the
>>> Demux + Archiving processes).
>>> The option of writing a plug-in demux class is available, but can I write
>>> other processes of my own to run in parallel to the demux+archiving, on
>>> the same data?
>>> What does it take?
>>> What classes should be inherited?
>>> How do I configure it (e.g. tell Chukwa to apply it to every piece of data)?
>>> Do I have to deal with duplicates myself?
>>> 
>>> Thanks a lot,
>> 
> 
>
