Thanks Ari, that helps. The TempFileUtil.writeASinkFile method seems similar to what I want, actually.
From looking at the code, though, it seems that a sink file contains
ChukwaArchiveKey -> ChunkImpl key/value pairs, but a processed file instead
contains ChukwaRecordKey -> ChukwaRecord pairs. If I followed that code as an
example, but just created the latter k/v pairs instead of the former, I'd be
good to go, correct?

On Tue, Jan 19, 2010 at 3:59 PM, Ariel Rabkin <asrab...@gmail.com> wrote:
> There isn't a polished utility for this, and there should be. I think
> it'll be entirely straightforward, depending on your specific
> requirements.
>
> If you look in org.apache.hadoop.chukwa.util.TempFileUtil.RandSeqFileWriter
> there's an example of code that writes out a sequence file for test
> purposes.
>
> --Ari
>
> On Tue, Jan 19, 2010 at 3:46 PM, Bill Graham <billgra...@gmail.com> wrote:
> > Hi,
> >
> > Is there an easy way (maybe using a utility class or the Chukwa API) to
> > manually create a sequence file of Chukwa records from a log file without
> > the need for HDFS?
> >
> > My use case is this: I've got Pig unit tests that read sequence file
> > input using ChukwaStorage from local disk. I generated these files by
> > putting data into the cluster and waiting for the data processor to run.
> > We're looking to change the log format, though, and I'd like to be able to
> > write and run the unit tests without putting the new data into the cluster.
> >
> > If there were a command-line way that I could do this, that would be very
> > helpful. Or if anyone could point me to the relevant classes, I could write
> > such a utility and contribute it back.
> >
> > thanks,
> > Bill
>
> --
> Ari Rabkin asrab...@gmail.com
> UC Berkeley Computer Science Department
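
If that reading is right, a minimal sketch might look like the following. This is untested and hedged: it assumes ChukwaRecordKey and ChukwaRecord live in org.apache.hadoop.chukwa.extraction.engine and expose setKey/setReduceType and setTime/add, and it writes through Hadoop's local FileSystem so no HDFS cluster is involved. The "MyDataType" reduce type and the "csource/<timestamp>" key layout are placeholder assumptions, not the real demux key convention.

```java
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecord;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecordKey;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class LocalChukwaRecordWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Local filesystem: no HDFS needed for unit-test fixtures.
        FileSystem fs = FileSystem.getLocal(conf);
        Path out = new Path("chukwa-records.seq");

        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, out, ChukwaRecordKey.class, ChukwaRecord.class);
        try {
            long now = System.currentTimeMillis();

            ChukwaRecordKey key = new ChukwaRecordKey();
            key.setReduceType("MyDataType");   // assumed data type name
            key.setKey("csource/" + now);      // key layout is a placeholder

            ChukwaRecord record = new ChukwaRecord();
            record.setTime(now);
            record.add("body", "one line parsed from the original log file");

            writer.append(key, record);
        } finally {
            writer.close();
        }
    }
}
```

In a real utility you'd loop over the lines of the input log file, building one key/record pair per line, and ChukwaStorage should then be able to load the resulting file from local disk in the Pig tests.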