We are planning to integrate a Frequency algorithm with Siddhi CEP as part of a training project [1].
Basically, the algorithm calculates the number of occurrences (frequency) of a specified attribute in a given CEP input stream. We have chosen to implement this functionality as a Siddhi Transformer, using stream-lib [2] as a third-party library. stream-lib is licensed under the Apache License 2.0.

A standard Siddhi query using this algorithm would look like this:

    from inputStream#transform.custom:getFrequencies(desiredAttribute)
    select desiredAttribute, frequency
    insert into frequencyStream;

Where,

    inputStream      : input stream to CEP
    custom           : namespace
    getFrequencies   : function name
    desiredAttribute : attribute of the input stream for which frequencies need to be calculated
    frequencyStream  : output stream from CEP that contains the frequency information

The stream-lib library directly supports only Top-K and cardinality algorithms. The Top-K algorithm takes a value K from the user and returns the K distinct elements with the highest frequencies, together with those frequencies. The library provides no function for getting the frequencies of all elements.

So what we plan to do is pass the maximum integer value (Integer.MAX_VALUE) as the K argument to the Top-K algorithm. This gives us the frequencies of all distinct attribute values, provided that the number of distinct values does not exceed Integer.MAX_VALUE. We do not expect memory issues from passing Integer.MAX_VALUE, because the Top-K algorithm grows its bucket size dynamically as new distinct events arrive.

We have already implemented the above design, and basic testing looks fine. Kindly comment on the implementation.

[1] - https://redmine.wso2.com/issues/2884
[2] - https://github.com/addthis/stream-lib

--
Best Regards,
V.Rajeevan
Software Engineer,
WSO2 Inc. : http://wso2.com

Mobile : +94 773090875
Email : [email protected]
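
P.S. As a rough illustration of the Integer.MAX_VALUE idea described in this mail, here is a minimal, self-contained sketch. It does not use stream-lib and is not the transformer we implemented. BoundedFrequencyCounter, offer and topK are hypothetical names for a simplified capacity-bounded counter with Space-Saving style eviction. The point it shows is that when the capacity is Integer.MAX_VALUE nothing is ever evicted, so the counts are exact, and the backing map only grows as new distinct values arrive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Top-K counter; NOT stream-lib's implementation.
final class BoundedFrequencyCounter {
    private final int capacity;
    // Starts empty and grows only when a new distinct value is seen,
    // so a very large capacity does not preallocate memory.
    private final Map<String, Long> counts = new HashMap<>();

    BoundedFrequencyCounter(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Record one occurrence of the given attribute value.
    void offer(String value) {
        Long current = counts.get(value);
        if (current != null) {
            counts.put(value, current + 1);
        } else if (counts.size() < capacity) {
            counts.put(value, 1L);
        } else {
            // Capacity reached: replace the least frequent entry and inherit
            // its count (Space-Saving style). Never reached in practice when
            // capacity is Integer.MAX_VALUE.
            String minKey = null;
            long min = Long.MAX_VALUE;
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                if (e.getValue() < min) {
                    min = e.getValue();
                    minKey = e.getKey();
                }
            }
            counts.remove(minKey);
            counts.put(value, min + 1);
        }
    }

    // Return up to k values with the highest counts, in descending count order.
    Map<String, Long> topK(int k) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries.subList(0, Math.min(k, entries.size()))) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        BoundedFrequencyCounter counter = new BoundedFrequencyCounter(Integer.MAX_VALUE);
        for (String symbol : new String[] {"IBM", "WSO2", "IBM", "ORCL", "IBM", "WSO2"}) {
            counter.offer(symbol);
        }
        // Asking for the top Integer.MAX_VALUE entries returns every distinct value.
        System.out.println(counter.topK(Integer.MAX_VALUE)); // prints {IBM=3, WSO2=2, ORCL=1}
    }
}
```

With a small capacity the same class starts evicting and the counts become approximate, which is the behaviour we avoid by passing Integer.MAX_VALUE. Memory still grows with the number of distinct attribute values, so a stream with a very large number of distinct values would need that checked separately.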
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
