Hi All,
I am planning to evaluate different event stream clustering algorithms as part
of my studies (I am a graduate student at Indiana University). I think Siddhi is
a good place to experiment with this. As per my understanding based on the docs,
Siddhi doesn't have a stream clustering interface I can use directly to plug in
my own algorithm. So I am thinking of first coming up with an interface for
different clustering algorithms and then adding an algorithm implementation to
each event stream by invoking an operation like SiddhiManager.addQuery.
Alternatively, I can make the algorithm configurable as part of the query
language. If the second option is more
consistent with the current model I can wrap up the work that way, but
initially focusing on the first approach will be easier for me. Each algorithm
can be associated with a particular event stream or associated globally. If it
is associated with a stream, the algorithm will run local to that stream;
otherwise it will run in a global context. Depending on the algorithm, I can
provide a way to configure it with parameters.
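To make the first approach more concrete, here is a rough sketch of the kind of
interface I have in mind. Every name in it (StreamClusteringAlgorithm,
ClusteringRegistry, EventCounter) is hypothetical and is not an existing Siddhi
API; how the registry would actually be hooked into SiddhiManager is still open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plug-in contract for a stream algorithm (not part of Siddhi).
interface StreamClusteringAlgorithm {
    // Algorithm-specific parameters, e.g. "k" -> "10".
    void configure(Map<String, String> parameters);

    // Called once per event; streamId says which stream the event came from.
    void process(String streamId, Object[] event);

    // Current result as text, e.g. for writing to a file.
    String snapshot();
}

// Hypothetical registry that keeps per-stream and global algorithms apart.
class ClusteringRegistry {
    private final Map<String, List<StreamClusteringAlgorithm>> perStream = new HashMap<>();
    private final List<StreamClusteringAlgorithm> global = new ArrayList<>();

    // The algorithm sees only events of the given stream.
    void attach(String streamId, StreamClusteringAlgorithm algorithm) {
        perStream.computeIfAbsent(streamId, id -> new ArrayList<>()).add(algorithm);
    }

    // The algorithm sees events of every stream.
    void attachGlobally(StreamClusteringAlgorithm algorithm) {
        global.add(algorithm);
    }

    // Would be called from wherever the engine delivers events.
    void onEvent(String streamId, Object[] event) {
        List<StreamClusteringAlgorithm> local =
                perStream.getOrDefault(streamId, Collections.emptyList());
        for (StreamClusteringAlgorithm a : local) {
            a.process(streamId, event);
        }
        for (StreamClusteringAlgorithm a : global) {
            a.process(streamId, event);
        }
    }
}

// Trivial placeholder algorithm, only to show the wiring.
class EventCounter implements StreamClusteringAlgorithm {
    private long count;

    public void configure(Map<String, String> parameters) { }

    public void process(String streamId, Object[] event) { count++; }

    public String snapshot() { return "events=" + count; }
}
```

With this shape, an algorithm attached to one stream keeps state for that
stream only, while a globally attached one receives every event and can use the
streamId argument if it needs to tell streams apart.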
To start, I hope to implement a frequent itemset mining algorithm which can be
used to find the most frequent items of an event stream. Search engines use
this kind of data to find the most frequent searches in a given time window and
optimize the search queries. I can start with algorithms like the Misra-Gries
algorithm [1] and the Manku and Motwani algorithm [2], and then move towards
data clustering algorithms. For the time being I will write the clustering
results to a file, and later I think I can use more stable storage (either the
WSO2 registry or another preferred option in the WSO2 product stack). If Siddhi
or WSO2 CEP already has the capability of frequent item mining, I will start
with a more classification-type algorithm.
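For reference, this is a minimal sketch of the Misra-Gries summary as I
understand it, written as a standalone class with no Siddhi dependency. The
class name MisraGries and its methods are my own naming. With parameter k it
keeps at most k - 1 counters, and after n events any item that occurred more
than n/k times is guaranteed to be among the candidates (the stored counts are
underestimates by at most n/k).

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Misra-Gries frequent-items summary using at most k - 1 counters.
class MisraGries<T> {
    private final int k;
    private final Map<T, Long> counters = new HashMap<>();

    MisraGries(int k) {
        if (k < 2) {
            throw new IllegalArgumentException("k must be at least 2");
        }
        this.k = k;
    }

    // Feed one item of the stream into the summary.
    void offer(T item) {
        Long current = counters.get(item);
        if (current != null) {
            // Item is already tracked: increment its counter.
            counters.put(item, current + 1);
        } else if (counters.size() < k - 1) {
            // A free counter is available: start tracking the item.
            counters.put(item, 1L);
        } else {
            // No free counter: decrement every counter and drop the ones
            // that reach zero. The new item itself is not stored.
            Iterator<Map.Entry<T, Long>> it = counters.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<T, Long> entry = it.next();
                if (entry.getValue() == 1L) {
                    it.remove();
                } else {
                    entry.setValue(entry.getValue() - 1);
                }
            }
        }
    }

    // Copy of the candidate items with their (under)estimated counts.
    Map<T, Long> candidates() {
        return new HashMap<>(counters);
    }
}
```

The candidates can include items that are not actually frequent, so if exact
frequencies matter a second pass over the data (or over a retained window)
would be needed to verify them.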
Your feedback will be very useful for my work. If you have requirements for any
specific types of algorithms based on your real client interactions, I would
like to know about them so I can implement them with Siddhi and do the
comparison.
Thanks
Lahiru
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture