+1
I like the idea, only worried about the use of *#window.batch()*
How about using *WeatherStream#pull#window.time(24h) *instead?

Suho

On Fri, Nov 21, 2014 at 8:33 PM, Srinath Perera <[email protected]> wrote:

> Hi All,
>
>
> I think I have finally figured out a way do the Paul's idea (See the
> thread "Unifying CEP and BAM Languages"
> http://mail.wso2.org/mailarchive/architecture/2014-May/016366.html for
> more details).
>
>
> Currently event tables in Siddhi are backed by a disk and can be used to
> joins
>
> However, you must use it with a stream and it is triggered from a stream.
> For example currently *from WeatherStream insert into ..* is not defined.
>  *Idea is to use that for batch processing!*
>
> Proposal is to extend event tables definition
>
> 1. Add an timestamp field  to event tables (Then BAM stream also become a
> event table)
>
> 2. *Define “from” operation on event table to cover batch processing*.
> For example
>
> from EvetTable#window.batch(2h) avg(pressure) as avgPressure will run as
> batch job with 2h data using data in the event table. (We will execute this
> in Spark using Siddhi engine). If #window.batch(2h) is omitted,
> #window.batch(5m) is assumed.
>
> 3. You can define an event table on top of DB, disk, Cassandra, hdfs, BAM
> stream stored etc ..
>
> Let me take an example
>
> Say you want to calculate avg and stddev of pressure once every 24h as a
> batch job, and raise an alarm if current pressure is less than 3 stddev
> from the mean calculated in batch process. (this is a extended Lambda
> architecture scenario) Then you can do all this with Siddhi, and query will
> look like following.
>
> *define eventTable WeatherTable using BAMStream … *
>
> *define eventTable  WeatherStatsTable using BAMStream … *
>
> *//batch job*
>
> *from **WeatherTable**#window.batch(24h) stddev(pressure) , avg
> (pressure)*
>
> *     insert into **WeatherStatsTable**; *
>
> *//use results from batch job in realtime *
>
> *from WeatherStream as  p join **WeatherStatsTable**#window.last() as s *
>
> *          on pressure < s.mean -2*s.stddev*
>
> *     insert into WeatherAlarmStream; *
>
> First query runs once every 24 hours and calculate mean and stddev and
> write to disk. That value is joined against the live stream as they come in
> via join.
>
> Few more rules (and there are more details that we need to figure out)
>
> 1. You read from event table, then it runs as batch processes
>
> 2. If you join with event table, it works as now. However, you can define
> a window when joining on top of event tables as well. e.g.
> WeatherStats#window.batch(5m) means takes events came in last 5 mins.
>
> 3. We need to define how it behaves when timestamp field is not defined.
> Best is to only support *joins* and not support *from* in that case.
>
> 4. When processing batches, it runs in parallel if partitions are defined.
> For example, if you want to calculate mean in map reduce style, it will
> look like following.
>
> *define partition StationParition WeatherStream.stationID; *
>
> *from WeatherStream#window.batch(24h) avg(pressure) *
>
> *     insert into WeatherMeanStream using StationParition;  *
>
> If no partitions, it will run sequentially (We can improve on this later).
>
> For execution, we want to init siddhi within Spark, and run thing in
> parallel using spark, but actual evaluation will done by Siddhi.
>
> 5. Users can define other windows like sliding windows with event tables.
> However, Siddhi will read data from disk once every 5 minutes or so. So
> results will be the same, however, it might come bit later than with
> streams.
>
>
> Please comment
>
> Thanks
>
> Srinath
>
>
>
>
> --
> ============================
> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
> Site: http://people.apache.org/~hemapani/
> Photos: http://www.flickr.com/photos/hemapani/
> Phone: 0772360902
>



-- 

*S. Suhothayan*
Technical Lead & Team Lead of WSO2 Complex Event Processor
 *WSO2 Inc. *http://wso2.com
* <http://wso2.com/>*
lean . enterprise . middleware


*cell: (+94) 779 756 757 | blog: http://suhothayan.blogspot.com/
<http://suhothayan.blogspot.com/>twitter: http://twitter.com/suhothayan
<http://twitter.com/suhothayan> | linked-in:
http://lk.linkedin.com/in/suhothayan <http://lk.linkedin.com/in/suhothayan>*
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture

Reply via email to