Forgot to add: my reducer is located in the correct package (org.apache.hadoop.chukwa.extraction.demux.processor.reducer), and my mapper is in the mapper package.
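For context, here is roughly what the two classes look like, trimmed down to the relevant parts. This is only a sketch: each class actually lives in its own file under the package named in the comment above it, "RawMapper" is a placeholder name (only RawReducer is the real name), and the parse/process signatures and the protected key field follow the trunk AbstractProcessor/ReduceProcessor API as I understand it, so they may not match other Chukwa versions exactly.

// Sketch only -- each class goes in its own file under the package named above it.
import java.util.Iterator;

import org.apache.hadoop.chukwa.extraction.demux.processor.mapper.AbstractProcessor;
import org.apache.hadoop.chukwa.extraction.demux.processor.reducer.ReduceProcessor;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecord;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecordKey;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// org.apache.hadoop.chukwa.extraction.demux.processor.mapper.RawMapper
// ("RawMapper" is a placeholder name; only the reducer name is real)
public class RawMapper extends AbstractProcessor {
  @Override
  protected void parse(String recordEntry,
                       OutputCollector<ChukwaRecordKey, ChukwaRecord> output,
                       Reporter reporter) throws Throwable {
    ChukwaRecord record = new ChukwaRecord();
    record.add("body", recordEntry);               // illustrative record content
    // route this key to my custom reducer class
    key.setReduceType(RawReducer.class.getName());
    output.collect(key, record);
  }
}

// org.apache.hadoop.chukwa.extraction.demux.processor.reducer.RawReducer
public class RawReducer implements ReduceProcessor {
  @Override
  public String getDataType() {
    return this.getClass().getName();
  }

  @Override
  public void process(ChukwaRecordKey key, Iterator<ChukwaRecord> values,
                      OutputCollector<ChukwaRecordKey, ChukwaRecord> output,
                      Reporter reporter) {
    try {
      // pass the grouped records through unchanged (real logic trimmed out)
      while (values.hasNext()) {
        output.collect(key, values.next());
      }
    } catch (java.io.IOException e) {
      reporter.incrCounter("RawReducer", "exception", 1);
    }
  }
}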
On Sat, Mar 20, 2010 at 11:36 PM, Oded Rosen <o...@legolas-media.com> wrote:
> Hi,
> I have a strange error with a chukwa parser that I wrote.
>
> The reducer class implements
> org.apache.hadoop.chukwa.extraction.demux.processor.ReduceProcessor.
> In the map class I'm setting:
> key.setReduceType(RawReducer.class.getName());
>
> and in the reducer I have:
> @Override
> public String getDataType() {
>     return this.getClass().getName();
> }
>
> The demux conf redirects my data type to my mapper class.
>
> I've placed these classes in a jar in the hdfs:/../chukwa/demux folder, but
> the reducer will not execute.
> I get a map/reduce job with a map input of a few million bytes, but the map
> output bytes always equal 0.
> No output is written to the repos or any other dir under hdfs/../chukwa.
> I guess the output is empty because demux cannot find my reducer.
> I've also tried putting these classes in the chukwa-core jar, with the same
> results.
>
> I've already successfully written a mapper-only solution, but I need the
> reducer this time.
> What am I doing wrong?
>
> Thanks in advance,
>
>
> On Wed, Mar 10, 2010 at 6:28 AM, Eric Yang <ey...@yahoo-inc.com> wrote:
>
>> Hi Oded,
>>
>> Chukwa 0.3 does not support external class files. For TRUNK, you can
>> create your own parser to run in demux. The parser class should extend
>> org.apache.hadoop.chukwa.extraction.demux.processor.AbstractProcessor for
>> a mapper, or implement
>> org.apache.hadoop.chukwa.extraction.demux.processor.ReduceProcessor for
>> a reducer. Edit CHUKWA_CONF/chukwa-demux-conf.xml, and reference the
>> RecordType to your class names.
>>
>> Once you have both class files and the chukwa-demux-conf.xml file, put your
>> jar file in hdfs://namenode:port/chukwa/demux and the next demux job will
>> pick up the parsers and run them automatically. Duplicate detection should
>> be handled by your mapper or reducer class, or by a post-demux step; Chukwa
>> does not offer duplicate detection currently. Hope this helps.
>>
>> Regards,
>> Eric
>>
>>
>>
>> On 3/9/10 1:01 PM, "Oded Rosen" <o...@legolas-media.com> wrote:
>>
>> > Hi,
>> >
>> > I wonder whether one can write an additional data process (in addition to
>> > the Demux + Archiving processes).
>> > The option of writing a plug-in demux class is available, but can I write
>> > another process of my own to run in parallel to the demux + archiving, on
>> > the same data?
>> > What does it take?
>> > What classes should be inherited?
>> > How do I configure it (e.g. tell chukwa to apply it to every piece of data)?
>> > Do I have to deal with duplicates myself?
>> >
>> > Thanks a lot,
>>
>
>
> --
> Oded

--
Oded