Thanks Jerome,

Actually, my use case is like this:

Step 1: I give my logs to Chukwa, and it processes them to generate
sequence files.

Step 2: I then feed these output sequence files to my map-reduce program to
convert them into Avro format.

Problem in Step 1:

My map-reduce program expects <ChukwaRecordKey, ChukwaRecord> as input for
further processing.

However, the final sequence files I am getting, generated in the FinalArchives
folder, are in the format <ChukwaArchiveKey, ChunkImpl>.
Instead, I want my final output in the format <ChukwaRecordKey, ChukwaRecord>,
as generated by the demuxer, for further processing.

Please correct me if my understanding is wrong: logs are archived first, and
then the output of the archiver becomes the input to the demuxer, which
generates the final output. If so, I should get <ChukwaRecordKey, ChukwaRecord>
files as the final output.

Are there any configuration settings needed on the Chukwa side to achieve the
desired output?

No problem in Step 2.

What should the correct behavior of this whole process be? Any pointers
regarding this would be helpful.

Thanks in advance,
Stuti

-----Original Message-----
From: Jerome Boulon [mailto:jbou...@netflix.com]
Sent: Monday, May 17, 2010 10:09 PM
To: chukwa-user@hadoop.apache.org
Subject: Re: Problem in Output file format in FinalArchives

Hi Stuti,

There are two outputs in Chukwa:
1- Collectors write SeqFiles in this format: <ChukwaArchiveKey, ChunkImpl>
1.1- Archives are in the same format: <ChukwaArchiveKey, ChunkImpl>
2- Demux output is in this format: <ChukwaRecordKey, ChukwaRecord>
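
To tell which of these two formats a given file is in, you can look at the
SequenceFile header, which records the key and value class names right after
the "SEQ" magic bytes and a version byte. Below is a minimal sketch in plain
Java (no Hadoop jars needed). The class and method names are made up for
illustration, and it assumes a header version of 4 or later (class names
written as a length byte followed by UTF-8 bytes) and class names shorter than
128 bytes. It matches on the simple class names only, so it does not depend on
the exact Chukwa package names.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Chukwa or Hadoop.
public class SeqFileHeaderCheck {

    // Returns {keyClassName, valueClassName} parsed from the first bytes of a SequenceFile.
    public static String[] keyAndValueClass(byte[] head) {
        ByteBuffer buf = ByteBuffer.wrap(head);
        if (buf.remaining() < 4 || buf.get() != 'S' || buf.get() != 'E' || buf.get() != 'Q') {
            throw new IllegalArgumentException("not a SequenceFile (missing SEQ magic)");
        }
        int version = buf.get();
        if (version < 4) {
            // Assumption: older versions store class names differently; not handled here.
            throw new IllegalArgumentException("unsupported header version " + version);
        }
        String key = readShortString(buf);
        String value = readShortString(buf);
        return new String[] {key, value};
    }

    // Reads a string stored as a single length byte (0..127) followed by UTF-8 bytes.
    private static String readShortString(ByteBuffer buf) {
        if (!buf.hasRemaining()) {
            throw new IllegalArgumentException("header truncated");
        }
        int len = buf.get();
        if (len < 0 || len > buf.remaining()) {
            throw new IllegalArgumentException("length not handled by this sketch: " + len);
        }
        byte[] bytes = new byte[len];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Maps the class-name pair to "archive", "demux", or "unknown".
    public static String classify(String keyClass, String valueClass) {
        if (keyClass.endsWith("ChukwaArchiveKey") && valueClass.endsWith("ChunkImpl")) {
            return "archive"; // collector / archive output
        }
        if (keyClass.endsWith("ChukwaRecordKey") && valueClass.endsWith("ChukwaRecord")) {
            return "demux"; // demux output
        }
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a local copy of the sequence file.
        byte[] head = new byte[512];
        int n = 0;
        try (InputStream in = new FileInputStream(args[0])) {
            int r;
            while (n < head.length && (r = in.read(head, n, head.length - n)) > 0) {
                n += r;
            }
        }
        String[] kv = keyAndValueClass(Arrays.copyOf(head, n));
        System.out.println(kv[0] + " / " + kv[1] + " -> " + classify(kv[0], kv[1]));
    }
}
```

Run it against a local copy of the file (for example one fetched with
"hadoop fs -get"). A file that reports "archive" still has to go through Demux
before a job expecting <ChukwaRecordKey, ChukwaRecord> can read it.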

So if you want to have your Demux output in Avro format then you need to
have your own AvroOutputFormat in Demux.
I've already done some work to be able to use any Hadoop output format at
the demux level, but I haven't published my code yet.

What is your time range?

/Jerome.


On 5/16/10 9:58 PM, "Stuti Awasthi" <stuti_awas...@persistent.co.in> wrote:

>
>
> Hello Guys,
>
>
> I am a newbie to Chukwa and I am trying to convert the Chukwa sequence files
> produced by the demuxer (<ChukwaRecordKey, ChukwaRecord>) to Avro format.
> Currently I am using Chukwa 0.3.0.
>
> I could set up and run Chukwa successfully on an Ubuntu machine; the agent and
> collector started successfully and files were created in the finalArchives
> folder.
>
> The files in FinalArchives are of type <ChukwaArchiveKey and ChunkImpl>, but
> according to the Chukwa documentation and my findings I think the files
> should be of format <ChukwaRecordKey, ChukwaRecord>.
>
> I used chukwa-data-processors.sh to start the data processor and changed the
> archive.grouper property in chukwa-demux-conf.xml to Stream:
> <property>
> <name>archive.grouper</name>
> <value>Stream</value>
> <description>How to group archive files. Choices are Hourly, Daily, DataType,
> and Stream.</description>
> </property>
>
> I looked into the source code of demux.java, which takes <ChukwaArchiveKey
> and ChunkImpl> (the output of the archiver) as input and gives
> <ChukwaRecordKey, ChukwaRecord> as output. But I am not sure why this is not
> happening in my case.
>
> I want to feed my final output to the MetricDataLoader class, which takes
> <ChukwaRecordKey, ChukwaRecord> as input. Please let me know if I am missing
> something here.
>
> What should the correct behavior of this whole process be? Any pointers
> regarding this would be helpful.
>
>
> Thanks in advance,
>
> Stuti
>
>
>
>
> DISCLAIMER
> ==========
> This e-mail may contain privileged and confidential information which is the
> property of Persistent Systems Ltd. It is intended only for the use of the
> individual or entity to which it is addressed. If you are not the intended
> recipient, you are not authorized to read, retain, copy, print, distribute or
> use this message. If you have received this communication in error, please
> notify the sender and delete all copies of this message. Persistent Systems
> Ltd. does not accept any liability for virus infected mails.
>


