Here's a JIRA with a patch. Let me know if you think I should refactor any parts of it:
https://issues.apache.org/jira/browse/CHUKWA-449

On Tue, Jan 19, 2010 at 6:03 PM, Ariel Rabkin <asrab...@gmail.com> wrote:

> Yes, if by processing you mean "demux". Which should be renamed, I
> think, at some point.
>
> --Ari
>
> On Tue, Jan 19, 2010 at 4:53 PM, Bill Graham <billgra...@gmail.com> wrote:
> > Thanks Ari, that helps. The TempFileUtil.writeASinkFile method seems similar
> > to what I want actually.
> >
> > From looking at the code though it seems that a sink file contains
> > ChukwaArchiveKey -> ChunkImpl key value pairs, but a processed file instead
> > contains ChukwaRecordKey -> ChukwaRecord pairs.
> >
> > If I followed that code as an example, but just created the latter k/v pairs
> > instead of the former I'd be good to go, correct?
> >
> > On Tue, Jan 19, 2010 at 3:59 PM, Ariel Rabkin <asrab...@gmail.com> wrote:
> >>
> >> There isn't a polished utility for this, and there should be. I think
> >> it'll be entirely straightforward, depending on your specific
> >> requirements.
> >>
> >> If you look in
> >> org.apache.hadoop.chukwa.util.TempFileUtil.RandSeqFileWriter
> >> there's an example of code that writes out a sequence file for test
> >> purposes.
> >>
> >> --Ari
> >>
> >> On Tue, Jan 19, 2010 at 3:46 PM, Bill Graham <billgra...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > Is there an easy way (maybe using a utility class or the chukwa API) to
> >> > manually create a sequence file of chukwa records from a log file without
> >> > the need for HDFS?
> >> >
> >> > My use case is this: I've got pig unit tests that read sequence file
> >> > input using ChukwaStorage from local disk. I generated these files by
> >> > putting data into the cluster and waiting for the data processor to run.
> >> > We're looking to change the log format though, and I'd like to be able to
> >> > write and run the unit tests without putting the new data into the
> >> > cluster.
> >> >
> >> > If there were a command line way that I could do this that would be very
> >> > helpful. Or if anyone could point me to the relevant classes, I could write
> >> > such a utility and contribute it back.
> >> >
> >> > thanks,
> >> > Bill
> >>
> >> --
> >> Ari Rabkin asrab...@gmail.com
> >> UC Berkeley Computer Science Department
>
> --
> Ari Rabkin asrab...@gmail.com
> UC Berkeley Computer Science Department
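
For anyone following along, here is a rough sketch of the log-line-to-record step discussed in the thread. It is not the code from the patch. RecordKey, Record, Pair and toRecords are hypothetical stand-ins so the example runs with only the JDK; the real classes are ChukwaRecordKey and ChukwaRecord. The "hour bucket/source/timestamp" key layout and the "body" field name are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalRecordSketch {

    /** Hypothetical stand-in for ChukwaRecordKey: a reduce type plus a key string. */
    static final class RecordKey {
        final String reduceType;
        final String key;

        RecordKey(String reduceType, String key) {
            this.reduceType = reduceType;
            this.key = key;
        }
    }

    /** Hypothetical stand-in for ChukwaRecord: a timestamp plus named string fields. */
    static final class Record {
        final long time;
        final Map<String, String> fields = new LinkedHashMap<>();

        Record(long time) {
            this.time = time;
        }
    }

    /** One key/value pair, as it would be appended to a sequence file. */
    static final class Pair {
        final RecordKey key;
        final Record value;

        Pair(RecordKey key, Record value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Turns raw log lines into key/value pairs, one per line.
     * Timestamps are synthesized as baseTime + line index (an assumption;
     * a real utility would parse them out of each line).
     */
    static List<Pair> toRecords(List<String> lines, String dataType,
                                String source, long baseTime) {
        List<Pair> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            long ts = baseTime + i;
            long hourBucket = ts - (ts % 3600000L); // truncate to the hour
            RecordKey key = new RecordKey(dataType,
                    hourBucket + "/" + source + "/" + ts); // assumed key layout
            Record record = new Record(ts);
            record.fields.put("body", lines.get(i)); // assumed field name
            out.add(new Pair(key, record));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Pair> pairs = toRecords(
                Arrays.asList("first log line", "second log line"),
                "MyLogType", "localhost", 1263945600000L);
        for (Pair p : pairs) {
            System.out.println(p.key.reduceType + " " + p.key.key
                    + " -> " + p.value.fields);
        }
    }
}
```

In a real utility the pairs would not be kept in a list. Each one would be appended through a Hadoop SequenceFile.Writer opened on the local filesystem, which is what lets the unit tests run without HDFS.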