Hi all,

Regarding the average and standard deviation of a stream from a specific sensor: both statistics can be computed incrementally, and each update takes constant time. So I do not see the burden, even with a trivial implementation. Distributed stream processing also looks redundant for only hundreds of streams.
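A minimal sketch of what I mean by an incremental update, assuming Welford's online algorithm (my choice here; any numerically stable running formula would do). The names `RunningStats`, `stats`, and `on_message` are hypothetical, and the message is assumed to be already parsed from JSON into a dict with the `deviceId` / `readingN` fields described in the original question. Note that this keeps statistics over everything seen so far; a custom time range (last month, last year) would need something extra, such as per-bucket aggregates that are merged at query time.

```python
import math


class RunningStats:
    """Running mean and standard deviation with O(1) work per update (Welford)."""

    def __init__(self):
        self.n = 0        # number of readings seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the old mean times the deviation from the new mean.
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Population standard deviation; returns 0.0 before any data arrives.
        return math.sqrt(self.m2 / self.n) if self.n > 0 else 0.0


# One accumulator per (deviceId, reading name) pair.
stats = {}


def on_message(msg):
    """Fold one parsed sensor message into the per-reading accumulators."""
    device = msg["deviceId"]
    for key, value in msg.items():
        if key.startswith("reading"):
            stats.setdefault((device, key), RunningStats()).update(float(value))
```

With N=28 readings per message this is 28 constant-time updates and no query against historic data, which is why a single process should cope with hundreds of streams.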
Storm is a cluster-based distributed data processing system rather than a decentralized system like a sensor network. Whether it is applicable to your scenario depends on where you deploy it within your architecture.

Best,

2014-09-03 8:59 GMT+02:00 Vikas Agarwal <[email protected]>:

> Hi Yuheng,
>
> We are also exploring/implementing for analyzing stream of messages
> (twitter stream and other sources). With my short experience, one thing I
> came to know is that a lot would depend on the parallelism of the spouts in
> your topology, so you can parallelize the ingestion of data using
> partitioning or similar stuff, you can benefit from storm definitely
> otherwise you would see lot of failed messages which may accumulate a large
> backlog of such overflowing input data.
>
>
> On Wed, Sep 3, 2014 at 1:01 AM, Yuheng Du <[email protected]>
> wrote:
>
>> Hi guys,
>>
>> I have a stream of sensor data coming from rabbitmq. For each sensor
>> message, it is of the JSON format and have the following fields:
>>
>> deviceId: "BOT-N3"
>> reading0: 2.25
>> reading1: 3.78
>> ....
>> readingN: -1.35
>>
>> each float number of readingN represents a sensor reading on a specific
>> field location.
>>
>> Now for each incoming message, I want to do a query which gives me the
>> average and standard deviation of a certain 'deviceId' 's 'readingN' over a
>> custom time range (a year ago to now, a month ago to now, etc). So if N=28,
>> for each incoming message I will need to do 28 queries on the historic data
>> at almost the same time. I need the query results to be returned in near
>> real time so the other incoming messages won't get blocked.
>>
>> Is STORM a good solution to this issue?
>>
>> I have tried Elasticsearch-Logstash-Kibana stack already, It seems that
>> when the incoming message rates are high, The messages will be blocked
>> since the ES server can't correspond to hundreds of query requests at
>> the same time.
>>
>> Will STORM help me in this case? What is the common use case of STORM in
>> processing real-time sensor data (coming from sensor network specifically)?
>>
>> Thanks!
>>
>> best
>>
>> Yuheng
>>
>
>
>
> --
> Regards,
> Vikas Agarwal
> 91 – 9928301411
>
> InfoObjects, Inc.
> Execution Matters
> http://www.infoobjects.com
> 2041 Mission College Boulevard, #280
> Santa Clara, CA 95054
> +1 (408) 988-2000 Work
> +1 (408) 716-2726 Fax
