Hi,

We have been working on implementing the TopK algorithm (as a CEP window
extension), which is part of the training project [1].
We would like to check whether the design of the implementation is
acceptable from a functional and performance point of view.

1. Our final TopK Siddhi query will look like the one below.

*from eventStream#window.custom:topKfrequency("3","1",symbol) *
*select symbol*
*insert into topkResultStream;*

*eventStream* - input stream
*topkResultStream* - output stream
*topKfrequency* - the TopK window function. It takes three arguments. The
first is the capacity (size) of the window, the second is the update rate
of the window ("1" means the window is updated on every single event), and
the third is the attribute of the input stream that the TopK algorithm is
applied to.
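
To make the update rate argument concrete, here is a minimal sketch in plain Java. It assumes that an update rate of N means the window is recalculated once every N incoming events; that reading, and the names UpdateRateGate and onEvent, are hypothetical and are not taken from the extension code.

```java
// Hypothetical illustration of the update rate argument; not the extension code.
class UpdateRateGate {
    private final int updateRate;
    private int eventsSinceUpdate = 0;

    UpdateRateGate(int updateRate) {
        if (updateRate < 1) {
            throw new IllegalArgumentException("updateRate must be at least 1");
        }
        this.updateRate = updateRate;
    }

    // Count one incoming event and report whether the window should be refreshed now.
    boolean onEvent() {
        eventsSinceUpdate++;
        if (eventsSinceUpdate >= updateRate) {
            eventsSinceUpdate = 0;
            return true;
        }
        return false;
    }
}
```

Under this assumption, a gate built with 1 asks for a refresh on every event, while a gate built with 3 asks for one on every third event.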

2. We are using the streamlib library [2] for the TopK algorithm (and for
other sketching algorithms as well). We import the built library .jar file
into the TopK window extension class and call the TopK method from it. This
method returns the set of symbols that occur most frequently.
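
The result we expect from the TopK call can be illustrated with the plain Java sketch below. It keeps exact counts in a HashMap instead of using streamlib's sketching data structures, so it is only a model of the behaviour, not the library's API. The names TopKFrequencySketch, offer and topK are hypothetical, and the alphabetical tie-break is an assumption made to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: exact frequency counting, not the streamlib implementation.
class TopKFrequencySketch {
    private final Map<String, Long> counts = new HashMap<>();

    // Record one occurrence of the given symbol.
    void offer(String symbol) {
        counts.merge(symbol, 1L, Long::sum);
    }

    // Return up to k symbols, most frequent first; ties are broken alphabetically.
    List<String> topK(int k) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        TopKFrequencySketch sketch = new TopKFrequencySketch();
        for (String s : new String[] {"IBM", "WSO2", "IBM", "ORCL", "WSO2", "IBM", "MSFT"}) {
            sketch.offer(s);
        }
        System.out.println(sketch.topK(3));
    }
}
```

Exact counting needs memory proportional to the number of distinct symbols, which is the cost a sketching algorithm is meant to avoid on large streams.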

3. Within the uniquewindow in the extension class, we keep the latest
TopK events.

4. The window extension class has two window implementations: the main
uniquewindow and a temporary uniquewindow. The temporary window is added
in order to preserve the correct order of the TopK results.
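
The two-window idea in points 3 and 4 can be sketched roughly as below. This is only a simplified model under our own assumptions: events are represented as plain strings, the windows are LinkedHashMaps keyed by symbol, and the names OrderedTopKWindow, refresh, symbolsInOrder and eventFor are hypothetical and do not come from Siddhi or from the extension class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the main/temporary window idea; not the Siddhi extension code.
class OrderedTopKWindow {
    // Main window: latest event per symbol, iterated in TopK order.
    private Map<String, String> mainWindow = new LinkedHashMap<>();

    // Rebuild the main window so that its iteration order follows topKOrder.
    // latestEvents holds the newest event seen for each symbol since the last refresh.
    void refresh(List<String> topKOrder, Map<String, String> latestEvents) {
        Map<String, String> tempWindow = new LinkedHashMap<>();
        for (String symbol : topKOrder) {
            // Prefer the newest event; otherwise keep the one already in the main window.
            String event = latestEvents.containsKey(symbol)
                    ? latestEvents.get(symbol)
                    : mainWindow.get(symbol);
            if (event != null) {
                tempWindow.put(symbol, event);
            }
        }
        // Swap: the temporary window becomes the new main window.
        mainWindow = tempWindow;
    }

    List<String> symbolsInOrder() {
        return new ArrayList<>(mainWindow.keySet());
    }

    String eventFor(String symbol) {
        return mainWindow.get(symbol);
    }
}
```

Filling a temporary window in TopK order and then swapping it in means the main window is never observed half-reordered, and symbols that dropped out of the TopK set are discarded in the same step.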

We have already implemented the above design, and basic testing seems to
be OK.

Kindly comment on the implementation.

[1] - https://redmine.wso2.com/issues/2884
[2] - https://github.com/addthis/stream-lib

BR

*Asok Aravinda Perera*
Software Engineer
WSO2, Inc.;http://wso2.com/
lean.enterprise.middleware

Mobile: +94722241032
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
