In the initial table, *Union* can be done in Siddhi by sending the results to one stream/process.
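As a rough illustration of that mapping, here is a minimal sketch in plain Python (not Siddhi syntax). It assumes that "sending results to one stream" means several queries feeding one shared stream, which a single downstream process then consumes. The names `Event`, `union` and `running_total`, and the event fields, are hypothetical and exist only for this example.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: int  # hypothetical field: event time
    source: str     # hypothetical field: name of the originating stream
    value: float


def union(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge several timestamp-ordered streams into one ordered stream."""
    return heapq.merge(*streams, key=lambda e: e.timestamp)


def running_total(stream: Iterable[Event]) -> Iterator[float]:
    """Single downstream process that consumes the merged stream."""
    total = 0.0
    for event in stream:
        total += event.value
        yield total


# Two independent input streams, each already ordered by timestamp.
s1 = [Event(1, "S1", 10.0), Event(4, "S1", 5.0)]
s2 = [Event(2, "S2", 1.0), Event(3, "S2", 2.0)]

merged = list(union(s1, s2))          # the "one stream"
totals = list(running_total(merged))  # the "process" on that stream
```

The point of the sketch is only that a union needs no dedicated operator: once both inputs land in the same stream, any ordinary query on that stream sees the combined data.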
IMHO we don't need to explicitly differentiate between realtime and batch processing, but internally we should switch to batch mode when we need to process a huge dataset. Maybe we can call Hector directly from Siddhi windows. This will be very useful for most monitoring use cases, e.g. a user may need results per second, per day, per month and per year, which covers most of our BAM use cases.

We should also not confuse Big Data processing with realtime and batch-based analysis over a window. We can still have Hive to support edge cases.

Regards,
Suho

On Tue, May 6, 2014 at 3:13 PM, Srinath Perera <[email protected]> wrote:

> Hi Paul,
>
> I think of it as a two-stage process. I think there is a lot to gain from
> a common language even if there are two runtimes inside.
>
> If we are to unify the runtimes, then there are a lot of problems to be
> solved that MapReduce has already solved. I think running the batch case
> off Storm would affect performance, as MapReduce moves data as large
> files, which works best for the network.
>
> I think it is possible to unify the runtimes, but we have to think more
> about that.
>
> I think meanwhile, Siddhi to Hive conversion should be much simpler.
>
> --Srinath
>
> On Tue, May 6, 2014 at 2:52 PM, Paul Fremantle <[email protected]> wrote:
>
>> Srinath
>>
>> Are you assuming that we would convert Siddhi Lang into Hive SQL and then
>> run that? Is there any way we can make a version of the Siddhi engine that
>> runs as a batch job under Hadoop instead?
>>
>> Paul
>>
>> On 6 May 2014 10:18, Srinath Perera <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> I have thought about Paul's idea at the product council to unify the
>>> CEP and BAM languages. Looks like it can work.
>>>
>>> 1. Users can write all the queries using the Siddhi language.
>>> 2. If the windows defined in queries are large (e.g. more than 15
>>>    minutes for batch windows, or slides of more than 15 minutes for
>>>    sliding windows), the system will automatically generate Hive
>>>    scripts and run the scripts in MapReduce.
>>> 3. If not, queries get executed via CEP.
>>> 4. Incoming data can merge with stored data (e.g. a database or a flat
>>>    file) via event tables. We will have to do some work to make it work
>>>    seamlessly.
>>> 5. If you combine smaller and larger windows, the system should work
>>>    using CEP and MapReduce side by side.
>>>
>>> As far as I can tell, anything that can be done with a Hive script can
>>> be done with the Siddhi language.
>>>
>>> BAM             CEP
>>> --------------  ---------------------------------------------
>>> Retrieve All    from S1
>>> Retrieve Some   from S1[condition]
>>> Projection      from .. select
>>> Sort            Have to implement via sort window
>>> Group By        via partitions or via group by
>>> Transform       transform function TBD
>>> Join            Join with right windows
>>> Union           ?
>>> Map/Reduce      partition + queries => map
>>>                 send results to one stream, process => reduce
>>>
>>> There are a few built-in functions missing, like sin(), that we can
>>> easily add.
>>>
>>> Pros
>>> ====
>>> One language
>>> Cleaner model for both batch and realtime analytics
>>>
>>> Cons
>>> ====
>>> This does not work for "data copied as flat files". Such files need to
>>> be replayed, which may be expensive.
>>>
>>> Thoughts please. Would that work?
>>>
>>> Thanks
>>> Srinath
>>>
>>> --
>>> ============================
>>> Srinath Perera, Ph.D.
>>> Director, Research, WSO2 Inc.
>>> Visiting Faculty, University of Moratuwa
>>> Member, Apache Software Foundation
>>> Research Scientist, Lanka Software Foundation
>>> Blog: http://srinathsview.blogspot.com/
>>> Photos: http://www.flickr.com/photos/hemapani/
>>> Phone: 0772360902
>>
>> --
>> Paul Fremantle
>> CTO and Co-Founder, WSO2
>> OASIS WS-RX TC Co-chair, Apache Member
>>
>> UK: +44 207 096 0336
>> US: +1 646 595 7614
>>
>> blog: http://pzf.fremantle.org
>> twitter.com/pzfreo
>> [email protected]
>>
>> wso2.com Lean Enterprise Middleware

--
*S. Suhothayan*
Technical Lead & Team Lead of WSO2 Complex Event Processor
*WSO2 Inc.* http://wso2.com
lean . enterprise . middleware

cell: (+94) 779 756 757 | blog: http://suhothayan.blogspot.com/
twitter: http://twitter.com/suhothayan | linked-in: http://lk.linkedin.com/in/suhothayan
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
