Thanks, Javier. Here are the data records I am receiving:
 id  | sensor_id |  timestamp  |  period | current | date_received | 

So, as I understand it, each emitted tuple includes all of the fields above, 
but some records are missing from the timestamp sequence. For example, I 
should receive a record every second:
2015-08-11T17:01:49
2015-08-11T17:01:50
2015-08-11T17:01:51
2015-08-11T17:01:52
.
.
. 

However, I may instead get data like this:
2015-08-11T17:01:49
2015-08-11T17:01:50
2015-08-11T17:01:53

I must find the missing records for the 2 timestamps between 17:01:50 and 
17:01:53, and I will estimate each missing current value by averaging the 
current values of the last record received (17:01:50) and the next one (17:01:53).
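
To make this concrete, here is a rough sketch of the gap filling in plain 
Python (not Storm code). The function name fill_gaps, the 1-second step, and 
the sample current values are made up for illustration, and it assumes the 
records of one sensor arrive in timestamp order:

```python
from datetime import datetime, timedelta

STEP = timedelta(seconds=1)  # assumed reporting interval


def fill_gaps(records, step=STEP):
    """Yield (timestamp, current, estimated) tuples with gaps filled.

    records: iterable of (datetime, float) pairs for ONE sensor,
    already sorted by timestamp (an assumption of this sketch).
    """
    prev = None
    for ts, current in records:
        if prev is not None:
            prev_ts, prev_current = prev
            # Estimate every missing value as the plain average of the
            # last received record and the next received record.
            estimate = (prev_current + current) / 2.0
            missing_ts = prev_ts + step
            while missing_ts < ts:
                yield (missing_ts, estimate, True)
                missing_ts += step
        yield (ts, current, False)
        prev = (ts, current)


def parse(text):
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


# Hypothetical current values; only the timestamps come from my data.
sample = [
    (parse("2015-08-11T17:01:49"), 10.0),
    (parse("2015-08-11T17:01:50"), 12.0),
    (parse("2015-08-11T17:01:53"), 16.0),
]
filled = list(fill_gaps(sample))
for ts, current, estimated in filled:
    print(ts.isoformat(), current, "estimated" if estimated else "received")
```

In a topology, I suppose this logic would sit in a bolt that remembers the 
last record per sensor_id, so the stream would have to be grouped by 
sensor_id (a fields grouping) for each sensor's records to reach the same 
bolt instance.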

I am not sure if I explained this clearly. Thanks.

AL

> On Aug 11, 2015, at 3:35 PM, Javier Gonzalez <[email protected]> wrote:
> 
> Just to make sure I'm understanding correctly: Do you have a single stream of 
> sequential ids or multiple streams that need to be interpolated? Do you 
> receive a stream of ids and emit a stream of timestamped ids?
> 
> On Aug 11, 2015 5:34 PM, "Alec Lee" <[email protected]> wrote:
> Hello, all
> 
> I have a question about doing analytics with Storm. I have a data stream 
> coming in in real time, and each record carries a timestamp. Records are 
> supposed to be ingested every 1 second from devices, but we know some are 
> missing. Say we receive timestamp1, timestamp2, timestamp5; here the records 
> for timestamp3 and timestamp4 are missing. How can I identify these missing 
> records? I need to find out which records are missing based on the 
> sequential timestamps, and estimate the missing values from the last record 
> and the next record, for example by using their average as the missing 
> value. The output of this bolt should be a consecutive sequence of records 
> with no gaps.
> 
> 
> Thanks
> 
> 
> Al
