Hi Rajeevan,

There are already some algorithms for counting the most frequent events 
(though not per attribute). Have a look at [1]; it might be helpful for 
your implementation.

[1] https://wso2.org/jira/browse/CEP-872

Regards
Lahiru

On Oct 17, 2014, at 7:02 AM, Rajeevan Vimalanathan 
<[email protected]<mailto:[email protected]>> wrote:


We are planning to integrate a frequency algorithm with Siddhi CEP as part of 
a training project [1].

The algorithm calculates the number of occurrences (frequency) of each value 
of a specified attribute in a given CEP input stream.

We have chosen to implement this functionality as a Siddhi Transformer, using 
stream-lib [2] as a third-party library; it is licensed under the Apache 
License.

A standard Siddhi query using this algorithm would look like this:

from inputStream#transform.custom:getFrequencies(desiredAttribute)
select desiredAttribute, frequency
insert into frequencyStream;

Where,

inputStream : Input Stream to CEP
custom : namespace
getFrequencies : function name
desiredAttribute : Name of the input stream attribute whose frequencies are 
to be calculated
frequencyStream : Output Stream from CEP that contains the frequency 
information
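
To illustrate what the query is expected to compute, here is a minimal Java sketch of exact frequency counting over the values of one attribute. It uses only the standard library and is not the stream-lib or Siddhi implementation; the class name FrequencySketch, the method countFrequencies, and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only (not part of Siddhi or stream-lib).
class FrequencySketch {

    // Counts how many times each distinct attribute value occurs.
    // LinkedHashMap keeps values in first-seen order.
    static <T> Map<T, Long> countFrequencies(Iterable<T> attributeValues) {
        Map<T, Long> frequencies = new LinkedHashMap<>();
        for (T value : attributeValues) {
            frequencies.merge(value, 1L, Long::sum);
        }
        return frequencies;
    }

    public static void main(String[] args) {
        // Assumed sample values of desiredAttribute from five input events.
        List<String> symbols = Arrays.asList("IBM", "WSO2", "IBM", "ORCL", "IBM");
        System.out.println(countFrequencies(symbols));
    }
}
```

Each entry of the resulting map corresponds to one (desiredAttribute, frequency) pair of the kind the query selects into frequencyStream.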

The stream-lib library directly supports only Top-K and cardinality 
algorithms. The Top-K algorithm takes a value ‘K’ as an argument from the 
user and returns the K distinct elements with the highest frequencies, 
together with those frequencies. The library provides no function for getting 
the frequencies of all elements, so we are planning to pass the maximum 
integer value (Integer.MAX_VALUE) as the argument to the Top-K algorithm. We 
will then get frequencies for all distinct event attribute values, provided 
that the number of distinct values does not exceed Integer.MAX_VALUE.
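
The idea of asking for more elements than exist can be sketched as follows. This is a plain exact-counting Top-K written with the Java standard library only, not stream-lib's Top-K implementation; the names TopKSketch and topK are hypothetical. It only shows that when K exceeds the number of distinct values, every distinct value is returned with its frequency.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only (not stream-lib's algorithm).
class TopKSketch {

    // Returns up to k (value, count) entries, highest count first.
    // Ties are broken alphabetically so the order is deterministic.
    static List<Map.Entry<String, Long>> topK(List<String> values, int k) {
        Map<String, Long> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1L, Long::sum);
        }
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // If k is larger than the number of distinct values, return them all.
        return new ArrayList<>(entries.subList(0, Math.min(k, entries.size())));
    }
}
```

For example, calling topK with Integer.MAX_VALUE on a stream with three distinct values returns all three entries, while calling it with 2 returns only the two most frequent.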

Passing Integer.MAX_VALUE to the Top-K algorithm will not cause any memory 
issues, because the algorithm increases its bucket size dynamically as new 
distinct events arrive.

We have already implemented the above design, and basic tests seem to be OK.

Kindly comment on the implementation.

[1] - https://redmine.wso2.com/issues/2884

[2] - https://github.com/addthis/stream-lib

--
Best Regards,
V.Rajeevan
Software Engineer,
WSO2 Inc. : http://wso2.com

Mobile : +94 773090875
Email : [email protected]<mailto:[email protected]>
_______________________________________________
Architecture mailing list
[email protected]<mailto:[email protected]>
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture