Charini,

My knowledge of this domain is sparse, so I do not know whether a scenario where both time AND length apply is a valid business case. If it is a valid business case, +1 for a design that includes both parameters in the same implementation.
Thanks
/Tishan

On Thu, Jun 2, 2016 at 12:54 PM, Charini Nanayakkara <[email protected]> wrote:

> Hi Tishan,
>
> Yes. Assuming the batch size is 5 and the time window is 20 mins, only 5
> out of 10 events which arrive within the last 5 mins would be processed due
> to the batch size constraint (even though all 10 events would be processed
> if time alone were considered). Having separate implementations would work
> for the majority of scenarios, since usually only time OR length is
> applicable but not both. However, having two implementations would cause
> trouble in situations where both the time factor and the length are
> important (equivalent to an AND operation on the two constraints). If our
> requirement is to have only one of the two constraints, we can use a very
> large value for the other parameter (i.e. if we only need to limit the
> number of events based on a time = 1 sec constraint, we can specify
> 1,000,000 for the batch size, assuming we have prior knowledge that
> 1,000,000 events would never arrive within 1 sec). IMHO neither of the two
> options (separate or single implementation) is perfect for every scenario.
> However, having a single implementation would help address more cases, as I
> understand. What's your opinion on this?
>
> Thanks
>
> On Thu, Jun 2, 2016 at 10:14 AM, Charini Nanayakkara <[email protected]>
> wrote:
>
>> Hi All,
>>
>> I have planned to extend the existing Regression Function by adding a time
>> parameter. Regression is a functionality available in the Siddhi stream
>> processor extension known as timeseries. In the current implementation, the
>> regression function consumes two or more parameters and performs regression
>> as follows.
>>
>> The mandatory parameters to be given are the dependent attribute Y and
>> the independent attribute(s) X1, X2, ..., Xn. For performing simple linear
>> regression, only one independent attribute would be given. Two or more
>> independent attributes are consumed for executing multiple linear
>> regression.
>>
>> timeseries:regress(Y, X1, X2, ..., Xn)
>>
>> The other three optional parameters to be specified are calculation
>> interval, batch size and confidence interval (ci). In the case where those
>> are not specified, the default values would be assumed.
>>
>> timeseries:regress(calcInterval, batchSize, ci, Y, X1, X2, ..., Xn)
>>
>> Batch size works as a length window in this implementation, which allows
>> one to restrict the number of events considered when executing regression
>> in real time. For example, if the length is 5, only the latest 5 events
>> (the current event and the 4 events prior to it) would be used for
>> performing regression.
>>
>> *This suggested extension would allow the user to restrict the number of
>> events based on a time window as well, apart from constraining based on
>> length only. Therefore the regression function would consume duration as an
>> additional parameter once my task is completed.*
>>
>> *timeseries:regress(calcInterval, duration, batchSize, ci, Y, X1, X2, ..., Xn)*
>>
>> Here the parameter 'duration' would consist of two parts, where the
>> first part specifies the number and the second part specifies the unit
>> (e.g. 2 sec, 5 mins, 7 days). On arrival of each event, the past events to
>> be considered for performing regression would be based on this 'duration'
>> (i.e. if a new event arrives at 10.00 a.m. and the duration is 5 mins, only
>> the events which arrived within the time period of 9.55 a.m. to 10.00 a.m.
>> are considered for regression).
>>
>> Suggestions and comments are most welcome.
>>
>> Thank you.
>>
>> --
>> Charini Vimansha Nanayakkara
>> Software Engineer at WSO2
>> Mobile: 0714126293
>>
>
> --
> Charini Vimansha Nanayakkara
> Software Engineer at WSO2
> Mobile: 0714126293
>

--
Tishan Dahanayakage
Software Engineer
WSO2, Inc.
Mobile: +94 716481328
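
For reference, the combined behaviour discussed in this thread (time AND length applied together) could be sketched in Python roughly as below. This is only an illustration under stated assumptions, not the Siddhi timeseries extension code: the names parse_duration, TimeLengthWindow and simple_regression and the unit table are hypothetical, timestamps are assumed to be in seconds, the time window is assumed to include its lower boundary, and only simple linear regression with one independent attribute is shown.

```python
from collections import deque

# Hypothetical unit table; the thread itself only mentions "sec", "mins" and "days".
_UNIT_SECONDS = {
    "sec": 1, "secs": 1,
    "min": 60, "mins": 60,
    "hour": 3600, "hours": 3600,
    "day": 86400, "days": 86400,
}


def parse_duration(text):
    """Turn a 'number unit' string such as '5 mins' into seconds."""
    number, unit = text.split()
    return int(number) * _UNIT_SECONDS[unit.lower()]


class TimeLengthWindow:
    """Keeps only the events that satisfy BOTH the time and the length limit."""

    def __init__(self, duration, batch_size):
        self.duration = parse_duration(duration)
        self.batch_size = batch_size
        self.events = deque()  # each entry is (timestamp_seconds, y, x)

    def add(self, timestamp, y, x):
        """Add an event and return the events currently inside the window."""
        self.events.append((timestamp, y, x))
        # Time constraint: drop events older than `duration` before this event.
        while self.events[0][0] < timestamp - self.duration:
            self.events.popleft()
        # Length constraint: keep at most the latest `batch_size` events.
        while len(self.events) > self.batch_size:
            self.events.popleft()
        return list(self.events)


def simple_regression(events):
    """Ordinary least squares for y = a + b * x; returns (a, b) or None."""
    n = len(events)
    if n < 2:
        return None
    mean_x = sum(x for _, _, x in events) / n
    mean_y = sum(y for _, y, _ in events) / n
    sxx = sum((x - mean_x) ** 2 for _, _, x in events)
    if sxx == 0:
        return None  # all x values equal, so the slope is undefined
    sxy = sum((x - mean_x) * (y - mean_y) for _, y, x in events)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Scenario from the thread: batch size 5, time window 20 mins,
# 10 events arriving within 5 mins (one every 30 seconds).
window = TimeLengthWindow("20 mins", 5)
for i in range(10):
    kept = window.add(timestamp=30 * i, y=2.0 * i + 1.0, x=float(i))
print(len(kept))  # 5, because the length limit is the tighter constraint here
print(simple_regression(kept))
```

To apply only one of the two constraints with this sketch, the other parameter can be given a very large value, as suggested in the thread (for example a batch size of 1,000,000 together with a duration of "1 sec").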
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
