Hi,

In our product, we use Trident to do real-time aggregations at 5-minute intervals with persistentAggregate, and a State implementation for the RDBMS performs the multi-puts into the RDBMS. The RDBMS itself computes the higher-level rollup aggregations (15 minutes, 1 hour, 1 day, etc.) on top of the lowest level (5 minutes) that Trident updates.

When some records arrive late and still have to be aggregated, I can use the same persistentAggregate methodology to update the corresponding 5-minute aggregate rows in the RDBMS, no issues. But if the higher-level aggregations have already been completed for the period the late data belongs to (e.g. data arrives 2 hours late, which means eight 15-minute aggregations and two 1-hour aggregations have already been done by the RDBMS), how can we update those?

One idea I'm thinking of is to use the same persistentAggregate methodology to update the higher-level aggregates in the RDBMS as well, as and when a late-arriving record is processed. Done that way, the RDBMS still takes care of the heavy lifting across huge data for the rollup aggregations, while Storm does the lowest-level aggregation in real time and also updates the rollup aggregates in the RDBMS for late-arriving data. This would greatly simplify the late-arrival handling process.

Will this logic perform well? Any suggestions/improvements?

Thanks & Regards
MK
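
P.S. To make the idea concrete, here is a minimal sketch of how a late record's timestamp could be mapped to the aggregate row it affects at each rollup level. The class name, the level labels, and the assumption that buckets are aligned to the UTC epoch are all hypothetical, and the sketch leaves out the Trident wiring entirely. It only shows the key computation that would decide which rows get updated.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

public class RollupBuckets {
    // Hypothetical rollup levels, matching the intervals described above.
    static final Map<String, Duration> LEVELS = new LinkedHashMap<>();
    static {
        LEVELS.put("5m", Duration.ofMinutes(5));
        LEVELS.put("15m", Duration.ofMinutes(15));
        LEVELS.put("1h", Duration.ofHours(1));
        LEVELS.put("1d", Duration.ofDays(1));
    }

    // Start of the bucket that contains eventTime.
    // Assumption: buckets are aligned to the UTC epoch.
    static Instant bucketStart(Instant eventTime, Duration width) {
        long w = width.toMillis();
        long t = eventTime.toEpochMilli();
        return Instant.ofEpochMilli(Math.floorDiv(t, w) * w);
    }

    // One (level, bucket start) pair per rollup level: these are the
    // aggregate rows a single late record would have to update.
    static Map<String, Instant> bucketsFor(Instant eventTime) {
        Map<String, Instant> out = new LinkedHashMap<>();
        for (Map.Entry<String, Duration> e : LEVELS.entrySet()) {
            out.put(e.getKey(), bucketStart(eventTime, e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Instant late = Instant.parse("2014-03-10T09:47:12Z");
        bucketsFor(late).forEach(
            (level, start) -> System.out.println(level + " -> " + start));
    }
}
```

With something like this, each late record would fan out into one update per level, keyed by level and bucket start, so the same grouping and aggregation path could in principle serve all levels.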
