Hi Vijay,

There are two consolidation (down-sampling) frameworks in Chukwa. The first implementation was done in MySQL: a scheduler periodically executes SQL queries (aggregator.sql) to normalize the data and writes the results directly back into MySQL. The second implementation is currently under construction in Pig. There are still some problems (such as joins) to overcome in the second implementation.
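
To make "consolidation" concrete, here is a minimal sketch of the kind of down-sampling such a periodic job performs: averaging raw samples into fixed-width time buckets. It is illustration only and is not taken from aggregator.sql or the Pig scripts; the function name `consolidate`, the `(timestamp, value)` sample layout, and the 60-second bucket width are all assumptions made for the example.

```python
from collections import defaultdict


def consolidate(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Hypothetical helper for illustration; not part of Chukwa.
    Returns a list of (bucket_start, average_value), sorted by bucket_start.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in samples:
        # Align each sample to the start of the bucket it falls into.
        bucket_start = timestamp - (timestamp % bucket_seconds)
        sums[bucket_start] += value
        counts[bucket_start] += 1
    return [(start, sums[start] / counts[start]) for start in sorted(sums)]


# Five raw samples (seconds, metric value), consolidated into 60-second buckets.
raw = [(0, 10.0), (30, 20.0), (60, 40.0), (90, 60.0), (120, 5.0)]
print(consolidate(raw, 60))  # [(0, 15.0), (60, 50.0), (120, 5.0)]
```

A scheduler would run a pass like this at a fixed interval over the newest raw data and write the averaged rows to a coarser table; summing instead of averaging only changes the final division.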
In essence, we would like to have a scheduler that periodically executes a batch job to normalize the data. Let's wait and see how the second implementation turns out by the end of this month. We could implement a direct map/reduce consolidator later, once we have learned from that.

Regards,
Eric

On 6/8/09 11:39 AM, "Vijay" <[email protected]> wrote:

> Hi,
> I have a question on the aggregation. I notice the current Chukwa
> drive doesn't do any consolidation; it just stores all the data into the date
> partitions like months, days and weeks...
> Consolidation means to me: adding or averaging the data...
>
> In HBase we don't need to partition the data because it can hold a large
> data set... but my question is: are you guys working on consolidation or
> something?
>
> Because I need that functionality and am planning to add it in the post process.
>
> Regards,
> </VJ>
