Hi All,

I am planning to evaluate different event stream clustering algorithms as part
of my studies (I am a graduate student at Indiana University). I think Siddhi
is a good place to experiment with this. As far as I can tell from the docs,
Siddhi doesn't have a stream clustering interface that I can use directly to
plug in my own algorithm. So I am thinking of first coming up with an
interface for different clustering algorithms, and then attaching an algorithm
implementation to each event stream by invoking an operation like
SiddhiManager.addQuery. Alternatively, I can make the algorithm configurable
as part of the query language. If the second option is more consistent with
the current model I can wrap up the work that way, but initially focusing on
the first approach will be easier for me. Each algorithm can be associated
either with a specific event stream or globally. If it is associated with a
stream, the algorithm will run local to that stream; otherwise it will run in
a global context. Depending on the algorithm, I can provide a way to configure
it with parameters.
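
To make the first approach more concrete, here is a rough sketch of what such
an interface could look like. All of the names here (StreamAlgorithm,
AlgorithmRegistry, EventCounter) are hypothetical and are not part of Siddhi;
how this would hook into SiddhiManager is left open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plug-in interface for a stream algorithm (not a Siddhi API).
interface StreamAlgorithm {
    // Algorithm-specific parameters, passed as simple key/value pairs.
    void configure(Map<String, String> params);

    // Called once for every event the algorithm is registered to see.
    void onEvent(String streamId, Object[] event);
}

// Hypothetical registry that keeps per-stream and global associations.
class AlgorithmRegistry {
    private final Map<String, List<StreamAlgorithm>> perStream = new HashMap<>();
    private final List<StreamAlgorithm> global = new ArrayList<>();

    // The algorithm only sees events from the given stream.
    void addToStream(String streamId, StreamAlgorithm algorithm) {
        perStream.computeIfAbsent(streamId, k -> new ArrayList<>()).add(algorithm);
    }

    // The algorithm sees events from every stream.
    void addGlobal(StreamAlgorithm algorithm) {
        global.add(algorithm);
    }

    // Route one event to the stream-local algorithms, then to the global ones.
    void dispatch(String streamId, Object[] event) {
        List<StreamAlgorithm> local =
                perStream.getOrDefault(streamId, Collections.emptyList());
        for (StreamAlgorithm algorithm : local) {
            algorithm.onEvent(streamId, event);
        }
        for (StreamAlgorithm algorithm : global) {
            algorithm.onEvent(streamId, event);
        }
    }
}

// Trivial placeholder algorithm used only to show the wiring.
class EventCounter implements StreamAlgorithm {
    private long count;

    @Override
    public void configure(Map<String, String> params) {
        // No parameters needed for this placeholder.
    }

    @Override
    public void onEvent(String streamId, Object[] event) {
        count++;
    }

    long count() {
        return count;
    }
}
```

With this shape, an algorithm added with addToStream keeps state for that one
stream only, while one added with addGlobal keeps a single state across all
streams.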

To start, I hope to implement a frequent item set mining algorithm that can be
used to find the most frequent items in an event stream. Search engines use
this kind of data to find the most frequent searches in a given time window
and optimize the search queries. I can start with algorithms like the
Misra-Gries algorithm [1] and the Manku and Motwani algorithm [2], and then
move towards more data clustering algorithms. For the time being I will write
the clustering results to a file, and later I think I can use more stable
storage (either the WSO2 registry or another preferred option in the WSO2
product stack). If Siddhi or WSO2 CEP already has frequent item mining
capability, I will start with a more classification-type algorithm.
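
For reference, here is a minimal sketch of the Misra-Gries summary in plain
Java, independent of Siddhi. The class and method names are my own and only
illustrative. It keeps at most k - 1 counters, and any item that occurs more
than n/k times in a stream of n events is guaranteed to still hold a counter
at the end. The stored counts are underestimates, so a second pass (or a
tolerance for error) is needed to get exact frequencies.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative Misra-Gries frequent-items summary (names are hypothetical).
class MisraGries<T> {
    private final int capacity; // maximum number of counters, k - 1
    private final Map<T, Long> counters = new HashMap<>();

    MisraGries(int k) {
        if (k < 2) {
            throw new IllegalArgumentException("k must be at least 2");
        }
        this.capacity = k - 1;
    }

    // Process one item from the stream.
    void offer(T item) {
        Long current = counters.get(item);
        if (current != null) {
            // Item is already tracked: increment its counter.
            counters.put(item, current + 1);
        } else if (counters.size() < capacity) {
            // There is a free counter: start tracking the item.
            counters.put(item, 1L);
        } else {
            // No free counter: decrement every counter and drop the zeros.
            Iterator<Map.Entry<T, Long>> it = counters.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<T, Long> entry = it.next();
                if (entry.getValue() == 1L) {
                    it.remove();
                } else {
                    entry.setValue(entry.getValue() - 1);
                }
            }
        }
    }

    // Candidate frequent items with their (under)estimated counts.
    Map<T, Long> candidates() {
        return new HashMap<>(counters);
    }
}
```

In a Siddhi setting, offer would be called once per event with whichever
attribute is being counted (for example the search term), and candidates
would be read at the end of each time window.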

Your feedback will be very useful for my work. If you have requirements for
any specific types of algorithms based on your real client interactions, I
would like to know about them so that I can implement them with Siddhi and
include them in the comparison.

Thanks
Lahiru
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
