Thanks Ari, that helps. The TempFileUtil.writeASinkFile method seems similar to what I want, actually.
From looking at the code, though, it seems that a sink file contains
ChukwaArchiveKey -> ChunkImpl key/value pairs, but a processed file instead
contains ChukwaRecordKey -> ChukwaRecord pairs. If I followed that code as an
example, but just created the latter k/v pairs instead of the former, I'd be
good to go, correct?

On Tue, Jan 19, 2010 at 3:59 PM, Ariel Rabkin <asrab...@gmail.com> wrote:
> There isn't a polished utility for this, and there should be. I think
> it'll be entirely straightforward, depending on your specific
> requirements.
>
> If you look in org.apache.hadoop.chukwa.util.TempFileUtil.RandSeqFileWriter
> there's an example of code that writes out a sequence file for test
> purposes.
>
> --Ari
>
> On Tue, Jan 19, 2010 at 3:46 PM, Bill Graham <billgra...@gmail.com> wrote:
> > Hi,
> >
> > Is there an easy way (maybe using a utility class or the Chukwa API) to
> > manually create a sequence file of Chukwa records from a log file without
> > the need for HDFS?
> >
> > My use case is this: I've got Pig unit tests that read sequence file
> > input using ChukwaStorage from local disk. I generated these files by
> > putting data into the cluster and waiting for the data processor to run.
> > We're looking to change the log format, though, and I'd like to be able to
> > write and run the unit tests without putting the new data into the cluster.
> >
> > If there were a command-line way that I could do this, that would be very
> > helpful. Or if anyone could point me to the relevant classes, I could write
> > such a utility and contribute it back.
> >
> > thanks,
> > Bill
>
> --
> Ari Rabkin asrab...@gmail.com
> UC Berkeley Computer Science Department
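
If that reading is right, a minimal sketch might look like the following. This is untested and hedged: it assumes ChukwaRecordKey and ChukwaRecord live in org.apache.hadoop.chukwa.extraction.engine and expose setKey/setReduceType and setTime/add, and it writes through Hadoop's local FileSystem so no HDFS cluster is involved. The "MyDataType" reduce type and the "csource/<timestamp>" key layout are placeholder assumptions, not the real demux key convention.

```java
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecord;
import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecordKey;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class LocalChukwaRecordWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Local filesystem: no HDFS needed for unit-test fixtures.
        FileSystem fs = FileSystem.getLocal(conf);
        Path out = new Path("chukwa-records.seq");

        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, out, ChukwaRecordKey.class, ChukwaRecord.class);
        try {
            long now = System.currentTimeMillis();

            ChukwaRecordKey key = new ChukwaRecordKey();
            key.setReduceType("MyDataType");   // assumed data type name
            key.setKey("csource/" + now);      // key layout is a placeholder

            ChukwaRecord record = new ChukwaRecord();
            record.setTime(now);
            record.add("body", "one line parsed from the original log file");

            writer.append(key, record);
        } finally {
            writer.close();
        }
    }
}
```

In a real utility you'd loop over the lines of the input log file, building one key/record pair per line, and ChukwaStorage should then be able to load the resulting file from local disk in the Pig tests.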