Thanks, Javier. Here are the data records I am receiving:

id | sensor_id | timestamp | period | current | date_received |

So basically, as I understand it, each emitted tuple includes all of the fields above, but some records are missing from the timestamp sequence. For example, I should receive a record every second:

2015-08-11T17:01:49
2015-08-11T17:01:50
2015-08-11T17:01:51
2015-08-11T17:01:52
. . .

However, I may get data like this:

2015-08-11T17:01:49
2015-08-11T17:01:50
2015-08-11T17:01:53

I must find the missing records corresponding to the 2 timestamps between 17:01:50 and 17:01:53, and I will estimate each missing current value by averaging the current values at 17:01:50 and 17:01:53.

I am not sure if I explained this clearly. Thanks,

AL

> On Aug 11, 2015, at 3:35 PM, Javier Gonzalez <[email protected]> wrote:
>
> Just to make sure I'm understanding correctly: Do you have a single stream of
> sequential ids or multiple streams that need to be interpolated? Do you
> receive a stream of ids and emit a stream of timestamped ids?
>
> On Aug 11, 2015 5:34 PM, "Alec Lee" <[email protected]> wrote:
> Hello, all
>
> Here I have a question about storm doing analytics, I have a data stream
> coming in in real-time, each record associates a timestamp, it supposes to be
> ingested every 1 second from devices, but we know some records are missing,
> say, timestamp1, timestamp2, timestamp5, here timestamp3 and 4 records are
> missing. How can I identify these missing records, what I need to find out
> what records are missed base on the sequential timestamp, and estimate the
> missing values in terms of last record, and next record, i can make the
> average as this missing value. And output of this bolt will be a consecutive
> of data with no missing records.
>
>
> Thanks
>
>
> Al
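
To make the gap filling concrete, here is a rough sketch in plain Python of the logic I have in mind; it is not Storm code. It assumes that records for one sensor arrive in timestamp order and that the interval is one second. The `GapFiller` class, its `feed` method, and the current values in the example are all made up for illustration. Each missing timestamp gets the average of the readings on either side of the gap.

```python
from datetime import datetime, timedelta

STEP = timedelta(seconds=1)  # assumed reporting interval


class GapFiller:
    """Hypothetical helper: remembers the last reading per sensor and
    fills any gap once the next reading arrives."""

    def __init__(self, step=STEP):
        self.step = step
        self.last = {}  # sensor_id -> (timestamp, current)

    def feed(self, sensor_id, timestamp, current):
        """Return the readings to emit for this input.

        Each item is (sensor_id, timestamp, current, estimated).
        Assumes timestamps for a given sensor arrive in increasing order.
        """
        out = []
        prev = self.last.get(sensor_id)
        if prev is not None:
            prev_ts, prev_current = prev
            # Estimate: average of the readings before and after the gap.
            estimate = (prev_current + current) / 2.0
            ts = prev_ts + self.step
            while ts < timestamp:
                out.append((sensor_id, ts, estimate, True))
                ts += self.step
        out.append((sensor_id, timestamp, current, False))
        self.last[sensor_id] = (timestamp, current)
        return out


def parse(text):
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


# Example with made-up current values; 17:01:51 and 17:01:52 are missing.
filler = GapFiller()
readings = [
    ("2015-08-11T17:01:49", 10.0),
    ("2015-08-11T17:01:50", 12.0),
    ("2015-08-11T17:01:53", 16.0),
]
emitted = []
for text, current in readings:
    emitted.extend(filler.feed("s1", parse(text), current))

for record in emitted:
    print(record)
```

In a bolt, the same per-sensor state would presumably live in the bolt instance, with a fields grouping on sensor_id so that all tuples for one sensor reach the same task. Note that the estimated records can only be emitted after the next real record arrives, so the output lags the input by the length of the gap.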
