Hi all,

As discussed with the stakeholders, we will not continue this frequency implementation using the stream-lib library. Thank you all for your valuable responses.
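For anyone reading this thread in the archive, here is a minimal sketch of what exact per-attribute frequency counting can look like in plain Java, without stream-lib. It is only an illustration and not the implementation described in the quoted mail: the class name FrequencyCounter and its methods are hypothetical, and it does not use any Siddhi or stream-lib API. It keeps one counter per distinct attribute value, so memory grows with the number of distinct values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Siddhi or stream-lib):
// keeps an exact occurrence count for every distinct attribute value.
final class FrequencyCounter<T> {
    private final Map<T, Long> counts = new HashMap<>();

    // Records one occurrence of the value and returns its updated frequency.
    long offer(T value) {
        return counts.merge(value, 1L, Long::sum);
    }

    // Returns the frequency seen so far, or 0 if the value was never offered.
    long frequencyOf(T value) {
        return counts.getOrDefault(value, 0L);
    }

    // Number of distinct values seen so far.
    int distinctCount() {
        return counts.size();
    }

    public static void main(String[] args) {
        FrequencyCounter<String> counter = new FrequencyCounter<>();
        // Each event contributes the value of the attribute being tracked.
        for (String symbol : new String[] {"IBM", "WSO2", "IBM"}) {
            System.out.println(symbol + " -> " + counter.offer(symbol));
        }
    }
}
```

Each call to offer() corresponds to one incoming event and returns the (attribute value, frequency) pair that the query in the quoted mail would emit. Unlike a Top-K summary, this approach is exact but offers no bound on memory use.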
On Sat, Oct 18, 2014 at 10:43 PM, Ginnaliya Gamathige, Lahiru Manananda <[email protected]> wrote:

> Hi Rajeevan,
>
> There are already some algorithms to count the most frequent events (but
> not based on each attribute). You can have a look at them [1]; they might
> be helpful for your implementation.
>
> [1] https://wso2.org/jira/browse/CEP-872
>
> Regards
> Lahiru
>
> On Oct 17, 2014, at 7:02 AM, Rajeevan Vimalanathan <[email protected]>
> wrote:
>
> We are planning to integrate a Frequency algorithm with Siddhi CEP as
> part of a training project [1].
>
> Basically, this algorithm calculates the number of occurrences
> (frequency) of a specified attribute for a given input stream in CEP.
>
> We have chosen a Siddhi Transformer to implement this functionality,
> using stream-lib [2] as a third-party library, which is licensed under
> the Apache License 2.0.
>
> A standard Siddhi query using this algorithm would look like this:
>
>     from inputStream#transform.custom:getFrequencies(desiredAttribute)
>     select desiredAttribute, frequency
>     insert into frequencyStream;
>
> Where,
>
> inputStream : Input stream to CEP
>
> custom : Namespace
>
> getFrequencies : Function name
>
> desiredAttribute : Attribute name from the input stream for which
> frequencies need to be calculated
>
> frequencyStream : Output stream from CEP that contains the frequency
> information
>
> The stream-lib library directly supports only Top-K and cardinality
> algorithms. The Top-K algorithm takes a ‘K’ value as an argument from the
> user and returns the K distinct elements with the highest frequencies,
> together with those frequencies. The library provides no function for
> getting the frequencies of all elements. So what we are planning to do is
> pass the maximum integer value (Integer.MAX_VALUE) as the argument to the
> Top-K algorithm.
>
> That way, we will be able to get frequencies for all distinct event
> attributes, provided that the number of distinct event attributes does
> not exceed Integer.MAX_VALUE.
>
> Passing Integer.MAX_VALUE to the Top-K algorithm won’t cause any memory
> issues, because it increases its bucket size dynamically as new distinct
> events arrive.
>
> We have already implemented the above design, and basic testing seems to
> be OK.
>
> Kindly comment on the implementation.
>
> [1] - https://redmine.wso2.com/issues/2884
>
> [2] - https://github.com/addthis/stream-lib
>
> --
> Best Regards,
> V.Rajeevan
> Software Engineer,
> WSO2 Inc. : http://wso2.com
>
> Mobile : +94 773090875
> Email : [email protected]

--
Best Regards,
V.Rajeevan
Software Engineer,
WSO2 Inc. : http://wso2.com

Mobile : +94 773090875
Email : [email protected]
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
