If a new file format is chosen to replace the sequence file, there will be a converter from the sequence file format to the new format.

Regards,
Eric

On 2/26/10 8:04 PM, "James Seigel" <ja...@tynt.com> wrote:

> It sounds like there is some exciting work being done on the demux process.
> I was just wondering if you are planning to be backwards compatible with the
> 0.3 format for /repos as you move forward.
>
> Cheers
> james
>
>
> On 2010-02-26, at 10:38 AM, Eric Yang wrote:
>
>> On 2/26/10 4:43 AM, "Guillermo Pérez" <bi...@tuenti.com> wrote:
>>
>>> One related thing is that I want to modify the "cluster" where we put
>>> the files, because we will receive syslog data with several types of
>>> events that we want to store in different clusters to analyze, back up,
>>> and archive separately. I have seen that you can modify the
>>> Record.tagsField and that we use a regexp for extracting the
>>> destination cluster. This is a bit awkward, isn't it? I don't want to keep
>>> a tagsField just for that. I'm using a field "event_type" and I have
>>> modified extraction/engine/RecordUtil.java, so if that field
>>> exists, "event_" + <event_type> will be used as the cluster. Is this the
>>> proper way to go, or is there a better solution for this?
>>
>> I don't think you need to modify RecordUtil.java for this purpose. The
>> backfill Java program takes its first parameter as the cluster. Hence, you
>> could easily pass event_type as the first parameter when you backfill.
>>
>>> Another question is where I could start looking on how to build
>>> reports and aggregated results of the custom ChukwaRecords I'm
>>> inserting.
>>
>> There is currently no formal solution to generate reports from ChukwaRecords.
>> There is org.apache.hadoop.chukwa.dataloader.MetricDataLoader, which loads
>> ChukwaRecords into a MySQL database based on the mdl.xml file. After the data
>> is loaded, you could use hicc.sh to start the webserver, and visualize the
>> data in the Chukwa SQL Client widget. However, I must warn you that
>> MetricDataLoader is deprecated, and the future plan to generate reports from
>> ChukwaRecords is as follows:
>>
>> Have a post-demux data loader which waits to receive new ChukwaRecords
>> files and merges them with the existing ChukwaRecords files through a second
>> MR job. The second MR job also produces a low-resolution version of the data
>> for reports.
>>
>> /chukwa/repos/TYPE/DATE <-- Original data goes here.
>> /chukwa/report/TYPE/[yearly,monthly,weekly,daily] <-- Summarized JSON data goes here.
>>
>> The report JSON will be fixed to 300 data points per series, optimized for
>> graphing. I am taking it slow on the actual implementation because
>> ChukwaRecords should be moved to a faster serialization format. It's another
>> area that needs to be improved for the future plan to work.
>>
>> Regards,
>> Eric
>>
>
> James Seigel
> ja...@tynt.com
> http://www.tynt.com
> Captain Hammer
>
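
To illustrate the "300 data points per series" idea in the quoted report plan, here is a minimal sketch of one possible way to reduce a series to a fixed number of points by averaging equal-width buckets. This is not Chukwa code: the class name SeriesDownsampler, the method downsample, and the choice of bucket averaging are all assumptions made for illustration, and the thread does not say which reduction method the second MR job would use.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Chukwa.
public class SeriesDownsampler {
    // The thread mentions 300 points per series; the constant name is an assumption.
    public static final int TARGET_POINTS = 300;

    /**
     * Reduce a series to at most `target` points by averaging equal-width
     * buckets. Series that are already short enough are returned as a copy.
     */
    public static double[] downsample(double[] values, int target) {
        if (target <= 0) {
            throw new IllegalArgumentException("target must be positive");
        }
        if (values.length <= target) {
            return Arrays.copyOf(values, values.length);
        }
        double[] out = new double[target];
        for (int i = 0; i < target; i++) {
            // Bucket i covers indices [start, end); long math avoids int overflow.
            int start = (int) ((long) i * values.length / target);
            int end = (int) ((long) (i + 1) * values.length / target);
            double sum = 0;
            for (int j = start; j < end; j++) {
                sum += values[j];
            }
            out[i] = sum / (end - start);
        }
        return out;
    }

    public static void main(String[] args) {
        // Example input: one sample per minute for a day (1440 samples).
        double[] raw = new double[1440];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = i;
        }
        double[] report = downsample(raw, TARGET_POINTS);
        System.out.println(report.length + " points, first = " + report[0]);
    }
}
```

Averaging keeps the output size constant regardless of the time range, which is what makes a fixed point count convenient for graphing; a real implementation might prefer min/max or last-value reduction depending on the metric.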