Considering the use cases for this, getting the frequency over a particular time period would be very useful (e.g., the trading frequency of different stocks/products during the last 5 hours). Does your custom transformer support this? Or does it always take all events into account for the frequency calculations?
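To make the question concrete, here is a minimal sketch in plain Java of what a time-bounded frequency count could look like. It uses neither Siddhi nor stream-lib; the class name WindowedFrequencyCounter and its methods are hypothetical, and it assumes that event timestamps arrive in non-decreasing order. It keeps exact counts and drops events once they fall outside the window.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Siddhi or stream-lib.
// Counts occurrences of a key within a sliding time window.
class WindowedFrequencyCounter {

    // One observed event: the attribute value and when it was seen.
    private static final class Event {
        final String key;
        final long timestampMillis;

        Event(String key, long timestampMillis) {
            this.key = key;
            this.timestampMillis = timestampMillis;
        }
    }

    private final long windowMillis;
    private final Deque<Event> events = new ArrayDeque<>();
    private final Map<String, Long> counts = new HashMap<>();

    WindowedFrequencyCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Record an event. Assumes timestamps are non-decreasing.
    void add(String key, long timestampMillis) {
        expire(timestampMillis);
        events.addLast(new Event(key, timestampMillis));
        counts.merge(key, 1L, Long::sum);
    }

    // Frequency of a key within the window (now - windowMillis, now].
    long frequency(String key, long nowMillis) {
        expire(nowMillis);
        return counts.getOrDefault(key, 0L);
    }

    // Remove events that have fallen out of the window and decrement their counts.
    private void expire(long nowMillis) {
        long cutoff = nowMillis - windowMillis;
        while (!events.isEmpty() && events.peekFirst().timestampMillis <= cutoff) {
            Event old = events.removeFirst();
            counts.computeIfPresent(old.key,
                    (k, c) -> c > 1 ? Long.valueOf(c - 1) : null);
        }
    }
}
```

With a window of 5 hours (5L * 60 * 60 * 1000 milliseconds), frequency("IBM", now) would return only the trades of IBM seen in the last 5 hours. Memory grows with the number of events inside the window, so an approximate structure may be preferable for very busy streams.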
On Fri, Oct 17, 2014 at 4:32 PM, Rajeevan Vimalanathan <[email protected]> wrote:
> We are planning to integrate a Frequency algorithm with Siddhi CEP as
> part of a training project [1].
>
> Basically, this algorithm calculates the number of occurrences
> (frequency) of a specified attribute for a given input stream in CEP.
>
> We have chosen to implement this functionality as a Siddhi Transformer,
> using stream-lib [2] as a third-party library, which is licensed under
> the Apache License 2.0.
>
> A standard Siddhi query using this algorithm would look like this:
>
> from inputStream#transform.custom:getFrequencies(desiredAttribute)
> select desiredAttribute, frequency
> insert into frequencyStream;
>
> Where:
>
> inputStream : input stream to CEP
> custom : namespace
> getFrequencies : function name
> desiredAttribute : attribute name from the input stream for which
> frequencies need to be calculated
> frequencyStream : output stream from CEP that contains frequency-related
> information
>
> The stream-lib library directly supports only Top-K and cardinality
> algorithms. The Top-K algorithm takes a 'K' value as an argument from the
> user and returns the K distinct elements with the highest frequencies,
> together with those frequencies. The library provides no function for
> getting the frequencies of all elements. So what we are planning to do is
> pass the maximum integer value (Integer.MAX_VALUE) as the argument to the
> Top-K algorithm. That way we will be able to get frequencies for all
> distinct event attributes, provided that the number of distinct event
> attributes does not exceed Integer.MAX_VALUE.
>
> Passing Integer.MAX_VALUE to the Top-K algorithm won't cause any memory
> issues, because it grows its bucket size dynamically as new distinct
> events arrive.
>
> We have already implemented the above design, and basic testing seems to
> be OK.
>
> Kindly comment on the implementation.
>
> [1] - https://redmine.wso2.com/issues/2884
> [2] - https://github.com/addthis/stream-lib
>
> --
> Best Regards,
> V.Rajeevan
> Software Engineer,
> WSO2 Inc. : http://wso2.com
>
> Mobile : +94 773090875
> Email : [email protected]
>
> _______________________________________________
> Architecture mailing list
> [email protected]
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
