Re: [Architecture] A few questions about WSO2 CEP/Siddhi

Leo Romanoff Mon, 10 Mar 2014 04:23:25 -0700

One more thing:

It would be very useful if Siddhi/WSO2 CEP had a 
description/document like this:


http://esper.codehaus.org/tutorials/solution_patterns/solution_patterns.html


This document covers many generic CEP use-cases and then shows how they can be 
solved using Esper. 

Maybe Siddhi could take this document as a basis and provide a Siddhi-specific 
version of it, showing which of the described CEP use-cases can be 
expressed/implemented with Siddhi and how? Stating what is currently not 
possible or what is planned would also be very useful. 

I think it would be very valuable for users, because right now, due to the lack 
of information, it is pretty difficult to figure out what is possible with 
Siddhi and what its current limitations are. It could also be useful for Siddhi 
developers, because it may highlight features that are currently missing but 
are essential for implementing certain classes of scenarios.

I hope this proposal makes sense.

Best Regards,
  -Leo


Leo Romanoff <romix...@yahoo.com> wrote at 11:23 on Monday, 10 March 2014:
 
Hi all,
>
>
>First of all, thank you very much for your explanations and clarifications! It 
>is very interesting and useful!
>
>
>Let me ask a few more questions and provide a few comments.
>
>
>> Hi All, these questions and answers are very educating. Shall we add them to 
>> our doc FAQs? 
>
>
>I think it would be a very good idea to add something like this to the FAQs or 
>to create some sort of an "architecture and implementation overview" document.
>
>
>1) How many rules/queries can be defined in one engine? How does it affect 
>performance?
>>
>>   For example, can I define (tens of) thousands of queries using the same 
>>(or multiple) instance of SiddhiManager? Would it make processing much 
>>slower? Or is the speed not proportional to the number of queries? E.g. when 
>>a new event arrives, does Siddhi test it in a linear fashion against each 
>>query or does Siddhi keep an internal state machine that tries to match an 
>>event against all rules at once?
>>
>
>
>> SiddhiManager can have many queries. If you chain the queries in a linear 
>> fashion, all of them will be executed one after the other and you might see 
>> some performance degradation, but if you have them in parallel there won't 
>> be any issues.
>
>
>
>Well, before I got this answer, I created a few test-cases to check 
>experimentally how it behaves. I created a single instance of SiddhiManager 
>and added 10000 queries that all read from the same input stream, check 
>whether a specific attribute of an event (namely, price) is inside a given 
>random interval ( [ price >= random_low and price <= random_high ] ), and 
>output into a randomly chosen one of 100 streams. Then I measured the time 
>required to process 1000000 events using this setup. I also did exactly the 
>same experiment with Esper.
>
>
>My finding was that Siddhi is much slower than Esper in this setup. After 
>looking into the internal implementations of both, I realized the reason. 
>Siddhi processes all queries that read from the same input stream in a linear 
>fashion, sequentially. Even if many of the queries have almost the same 
>condition, Siddhi makes no attempt at optimization. Esper detects that many 
>queries have a condition on the same variable and creates some sort of 
>decision tree. As a result, its matching time grows as O(log N) in the number 
>of queries N, whereas Siddhi needs O(N). 
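To illustrate the difference described above, here is a minimal Java sketch that compares a linear scan over range filters with a lookup in a precomputed index. It is not Siddhi's or Esper's actual implementation; the class and method names (RangeIndexSketch, RangeQuery, Segment, buildIndex, indexedMatch) are hypothetical and exist only for this example. The index stores, for every interval endpoint, the queries that match exactly at that endpoint and the queries that match in the open gap up to the next endpoint, so a lookup is a single binary search. The construction is deliberately naive and memory-hungry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class RangeIndexSketch {

    /** Hypothetical filter query: matches when low <= price <= high. */
    static final class RangeQuery {
        final int id;
        final double low;
        final double high;

        RangeQuery(int id, double low, double high) {
            this.id = id;
            this.low = low;
            this.high = high;
        }

        boolean matches(double price) {
            return price >= low && price <= high;
        }
    }

    /** Query ids matching exactly at an endpoint, and in the open gap after it. */
    static final class Segment {
        final List<Integer> atPoint;
        final List<Integer> afterPoint;

        Segment(List<Integer> atPoint, List<Integer> afterPoint) {
            this.atPoint = atPoint;
            this.afterPoint = afterPoint;
        }
    }

    /** Baseline: test every query in turn, O(N) per event. */
    static List<Integer> linearMatch(List<RangeQuery> queries, double price) {
        List<Integer> hits = new ArrayList<>();
        for (RangeQuery q : queries) {
            if (q.matches(price)) {
                hits.add(q.id);
            }
        }
        return hits;
    }

    /** Built once per query set; naive quadratic construction kept for clarity. */
    static TreeMap<Double, Segment> buildIndex(List<RangeQuery> queries) {
        TreeSet<Double> endpoints = new TreeSet<>();
        for (RangeQuery q : queries) {
            endpoints.add(q.low);
            endpoints.add(q.high);
        }
        TreeMap<Double, Segment> index = new TreeMap<>();
        for (double p : endpoints) {
            List<Integer> at = new ArrayList<>();
            List<Integer> after = new ArrayList<>();
            for (RangeQuery q : queries) {
                if (q.matches(p)) {
                    at.add(q.id);
                    // The query also covers the gap after p if it ends later.
                    if (q.high > p) {
                        after.add(q.id);
                    }
                }
            }
            index.put(p, new Segment(at, after));
        }
        return index;
    }

    /** One binary search per event, independent of how many queries match. */
    static List<Integer> indexedMatch(TreeMap<Double, Segment> index, double price) {
        Map.Entry<Double, Segment> floor = index.floorEntry(price);
        if (floor == null) {
            return Collections.emptyList(); // price is below every interval
        }
        Segment s = floor.getValue();
        return floor.getKey().doubleValue() == price ? s.atPoint : s.afterPoint;
    }

    public static void main(String[] args) {
        List<RangeQuery> queries = Arrays.asList(
                new RangeQuery(1, 10, 20),
                new RangeQuery(2, 15, 30),
                new RangeQuery(3, 20, 25));
        TreeMap<Double, Segment> index = buildIndex(queries);
        System.out.println("linear:  " + linearMatch(queries, 22.0));
        System.out.println("indexed: " + indexedMatch(index, 22.0));
    }
}
```

Both methods return the same query ids for any price; only the per-event cost differs. A real engine would more likely use an interval tree or a similar structure to keep memory bounded.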
>
>
>I'm not saying that this test-case is very typical or important, but maybe 
>Siddhi should analyze the complete set of queries and apply some 
>optimizations where possible? I.e. a kind of global optimization. It could 
>detect common sub-expressions or sub-conditions in the queries and evaluate 
>them only once, instead of doing it over and over again by evaluating each 
>query separately.
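The idea of evaluating a shared sub-condition only once per event can be sketched in Java as follows. This is only an illustration under simplifying assumptions: each query refers to its condition by a key, and queries with the same key share one condition. The names SharedConditionSketch, Query, conditionKey and process are hypothetical and are not part of Siddhi.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoublePredicate;

class SharedConditionSketch {

    /** Hypothetical query: a key naming its condition, plus an output stream. */
    static final class Query {
        final String conditionKey;
        final String outputStream;

        Query(String conditionKey, String outputStream) {
            this.conditionKey = conditionKey;
            this.outputStream = outputStream;
        }
    }

    /**
     * Processes one event. Each distinct condition is evaluated at most once;
     * later queries with the same key reuse the cached result.
     */
    static List<String> process(List<Query> queries,
                                Map<String, DoublePredicate> conditions,
                                double price) {
        Map<String, Boolean> cache = new HashMap<>();
        List<String> outputs = new ArrayList<>();
        for (Query q : queries) {
            Boolean hit = cache.get(q.conditionKey);
            if (hit == null) {
                hit = conditions.get(q.conditionKey).test(price);
                cache.put(q.conditionKey, hit);
            }
            if (hit) {
                outputs.add(q.outputStream);
            }
        }
        return outputs;
    }
}
```

Detecting that two textually different queries share a sub-condition is the hard part in practice; the sketch assumes that this analysis has already assigned equal keys to equal conditions.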
>
>
>After getting these first results, I changed the setup so that each query 
>reads from one of many input streams (e.g. one of 300) instead of all using 
>the same one. This greatly improved the situation, because the number of 
>queries per input stream was now much smaller and thus processing was much 
>faster. But even in this setup Siddhi is still about 5-6 times slower than 
>Esper.
>
>
>
>
>
>>2) Is it possible to easily disable/enable some queries?
>>
>>In my use-cases I have a lot of queries. Actually, I have a lot of tenants 
>>and each tenant may have something like 10-100 queries. Rather often (e.g. 
>>a few times a day), tenants would like to disable/enable some of their 
>>queries. What is a proper way to do it? Is it a costly operation, i.e. does 
>>Siddhi need to perform a lot of processing to disable or enable a query?
>>Is it better to keep a dedicated SiddhiManager instance per tenant or is it 
>>OK to have one SiddhiManager instance which handles all those tenants with 
>>all their queries?
>>
>>
>> The general norm is that you use a SiddhiManager per scenario, where 
>> each scenario might contain one or more queries. 
>> With this model it's easy for any tenant to add or remove a scenario, and 
>> it will not affect other queries and tenants.
>
>
>If I have tens of thousands of tenants, then having a dedicated SiddhiManager 
>per tenant is probably not very practical or even possible, as it would get 
>pretty heavyweight, I guess.  
>
>
>Therefore, having the ability to enable/disable a query could be very 
>practical. In fact, it could probably be implemented very easily. Imagine that 
>each query object has a boolean flag that indicates whether it is enabled. 
>If the condition matches, then before performing the insert (i.e. the action) 
>Siddhi could check whether the query is disabled. If it is, no action (i.e. 
>insert) is performed at all. Of course, there is still some overhead from 
>matching the query. But maybe even this can be skipped if the query is 
>disabled? I.e. its conditions immediately evaluate to "false" and thus 
>never trigger?
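A minimal Java sketch of the flag-based approach described above, assuming a hypothetical query object (ToggleableQuerySketch.Query, setEnabled and onEvent are names invented for this example, not Siddhi API). The flag is checked first, so a disabled query skips both the condition check and the action.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.DoubleConsumer;
import java.util.function.DoublePredicate;

class ToggleableQuerySketch {

    /** Hypothetical query: a condition, an action, and an enabled flag. */
    static final class Query {
        private final DoublePredicate condition;
        private final DoubleConsumer action;
        // Atomic so that another thread can toggle the query while events flow.
        private final AtomicBoolean enabled = new AtomicBoolean(true);

        Query(DoublePredicate condition, DoubleConsumer action) {
            this.condition = condition;
            this.action = action;
        }

        void setEnabled(boolean value) {
            enabled.set(value);
        }

        /** Returns true only if the query was enabled, matched, and fired. */
        boolean onEvent(double price) {
            if (!enabled.get()) {
                return false; // disabled: skip matching and the action
            }
            if (!condition.test(price)) {
                return false;
            }
            action.accept(price);
            return true;
        }
    }
}
```

Toggling is then a single flag write, e.g. `query.setEnabled(false)`, with no need to remove and re-add the query. Note that a stateful query (windows, patterns) would also need a decision about what happens to its state while disabled, which this sketch does not address.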
>
>
>BTW, Esper has this feature. You can disable/enable any query without removing 
>it and adding it again later.
>
>
>When it comes to Siddhi persistent stores, you write:
>> It only stores the state information of the processing, e.g. the current 
>> running Avg of the average calculation. This will be used when the server 
>> recovers from a failure. 
>
>
>
>OK. I understand what it does now. BTW, does it also store sliding windows, 
>so that failover can work?
>
>
>My further question is: how can more dynamic scenarios be supported, where the 
>set of queries is not totally static? What if the set of rules changes a few 
>times per hour/day/etc? Maybe it would also make sense to persist the set of 
>queries that were deployed on a given SiddhiManager? This way a user wouldn't 
>need to do any custom book-keeping for the set of queries. 
>
>
>Yet another question about Siddhi:
>Is it possible to express queries that work with absolute time or timers 
>without providing a time inside events?  E.g. how can one express in a query 
>something like: "time is between 9:30 AM and 10:00 AM"? Is it possible to work 
>with timers in a query? Basically, I'd like to trigger certain actions at a 
>specific time or on a regular basis (every N minutes) and I'm wondering how 
>this can be expressed using Siddhi's query language.
>
>
>And my last question for now:
>Is it possible to have nested structures in events, e.g. something like this: 
>"select field1.field12[3].field1234 from ..."? It means that an event has a 
>field called field1, which in turn has an array sub-field called field12, and 
>each element of this array has a field field1234. Is it possible? Or does 
>Siddhi assume a flat structure of events, i.e. each event can have only fields 
>of basic types?
>
>
>
>
>Thanks,
>   Leo
>
>_______________________________________________
>Architecture mailing list
>Architecture@wso2.org
>https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>
>
