I think having both batchSize and duration will be good, as this limits the number of events considered, which can also help improve performance.
Suho

On Thu, Jun 2, 2016 at 1:59 PM, Charini Nanayakkara <[email protected]> wrote:

> Hi Tishan,
>
> For my requirement, having a time window alone is adequate. So your point
> might be valid. However, I'm concerned about the re-usability of the
> extension.
>
> @Srinath, WDYT? Which would be the better option? Having a single
> implementation or two different ones?
>
> Thanks
>
> On Thu, Jun 2, 2016 at 1:48 PM, Tishan Dahanayakage <[email protected]>
> wrote:
>
>> Charini,
>>
>> My knowledge of this domain is sparse. Hence I do not know whether a
>> scenario where time AND length both apply is a valid business case. If
>> it is a valid business case, +1 for the design including both
>> parameters in the same implementation.
>>
>> Thanks
>> /Tishan
>>
>> On Thu, Jun 2, 2016 at 12:54 PM, Charini Nanayakkara <[email protected]>
>> wrote:
>>
>>> Hi Tishan,
>>>
>>> Yes. Assuming batch size is 5 and time window is 20 mins, only 5 out
>>> of 10 events which arrive within the last 5 mins would be processed
>>> due to the batch size constraint (even though all 10 would be
>>> processed if time alone were considered). Having separate
>>> implementations would work in the majority of scenarios, since
>>> usually only time OR length is applicable, not both. However, having
>>> two implementations would cause trouble in situations where both the
>>> time factor and length are important (equivalent to an AND operation
>>> on the two constraints). If our requirement is to have only one of
>>> the two constraints, we can use a very large value for the other
>>> parameter (e.g. if we only need to limit the number of events based
>>> on a time = 1 sec constraint, we can specify 1,000,000 for batch
>>> size, assuming we have prior knowledge that 1,000,000 events would
>>> never arrive within 1 sec). IMHO neither of the two options (separate
>>> or single implementation) is perfect for every scenario. However,
>>> having a single implementation would help address more cases, as I
>>> understand. What's your opinion on this?
>>>
>>> Thanks
>>>
>>> On Thu, Jun 2, 2016 at 10:14 AM, Charini Nanayakkara <[email protected]>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have planned to extend the existing Regression Function by adding
>>>> a time parameter. Regression is a functionality available in the
>>>> Siddhi stream processor extension known as timeseries. In the
>>>> current implementation, the regression function consumes two or
>>>> more parameters and performs regression as follows.
>>>>
>>>> The mandatory parameters are the dependent attribute Y and the
>>>> independent attribute(s) X1, X2,....Xn. For simple linear
>>>> regression, only one independent attribute is given. Two or more
>>>> independent attributes are consumed for multiple linear regression.
>>>>
>>>> timeseries:regress(Y, X1, X2......,Xn)
>>>>
>>>> The other three optional parameters are calculation interval, batch
>>>> size and confidence interval (ci). Where those are not specified,
>>>> the default values are assumed.
>>>>
>>>> timeseries:regress(calcInterval, batchSize, ci, Y, X1, X2......,Xn)
>>>>
>>>> Batch size works as a length window in this implementation, which
>>>> allows one to restrict the number of events considered when
>>>> executing regression in real time. For example, if length is 5,
>>>> only the latest 5 events (the current event and the 4 events prior
>>>> to it) would be used for performing regression.
>>>>
>>>> *This suggested extension would allow the user to restrict the
>>>> number of events based on a time window as well, rather than on
>>>> length only. Therefore the regression function would consume
>>>> duration as an additional parameter once my task is complete.*
>>>>
>>>> *timeseries:regress(calcInterval, duration, batchSize, ci, Y, X1,
>>>> X2......,Xn).*
>>>>
>>>> Here the parameter 'duration' would consist of two parts, where the
>>>> first part specifies the number and the second part specifies the
>>>> unit (e.g. 2 sec, 5 mins, 7 days). On arrival of each event, the
>>>> past events to be considered for regression would be based on this
>>>> 'duration' (i.e. if a new event arrives at 10.00 a.m. and the
>>>> duration is 5 mins, only the events which arrived between 9.55 a.m.
>>>> and 10.00 a.m. are considered for regression).
>>>>
>>>> Suggestions and comments are most welcome.
>>>>
>>>> Thank you.
>>>>
>>>> --
>>>> Charini Vimansha Nanayakkara
>>>> Software Engineer at WSO2
>>>> Mobile: 0714126293
>>>
>>> --
>>> Charini Vimansha Nanayakkara
>>> Software Engineer at WSO2
>>> Mobile: 0714126293
>>
>> --
>> Tishan Dahanayakage
>> Software Engineer
>> WSO2, Inc.
>> Mobile:+94 716481328
>
> --
> Charini Vimansha Nanayakkara
> Software Engineer at WSO2
> Mobile: 0714126293

--
*S. Suhothayan*
Technical Lead & Team Lead of WSO2 Complex Event Processor
*WSO2 Inc.* http://wso2.com
lean . enterprise . middleware
*cell: (+94) 779 756 757 | blog: http://suhothayan.blogspot.com/
twitter: http://twitter.com/suhothayan | linked-in: http://lk.linkedin.com/in/suhothayan*
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
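
For anyone following the thread, here is a rough sketch of the AND semantics discussed above (an event is kept only if it satisfies both the duration and the batch size limit). It is plain Python rather than the actual Siddhi extension code, and everything in it is an assumption made for illustration: the names `parse_duration` and `TimeAndLengthWindow`, the unit table, timestamps given in seconds and arriving in non-decreasing order, and the choice to treat the window boundary as inclusive (an event exactly 5 mins old is still kept for a 5 mins duration).

```python
import re
from collections import deque

# Hypothetical unit table: the thread only gives "2 sec, 5 mins, 7 days" as examples.
_UNIT_SECONDS = {
    "sec": 1, "secs": 1,
    "min": 60, "mins": 60,
    "hour": 3600, "hours": 3600,
    "day": 86400, "days": 86400,
}


def parse_duration(text):
    """Turn a '<number> <unit>' string such as '5 mins' into seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]+)\s*", text)
    if not match:
        raise ValueError(f"cannot parse duration: {text!r}")
    number, unit = int(match.group(1)), match.group(2).lower()
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unknown time unit: {unit!r}")
    return number * _UNIT_SECONDS[unit]


class TimeAndLengthWindow:
    """Keeps only events that satisfy BOTH the duration and the batch size."""

    def __init__(self, duration, batch_size):
        self.duration_seconds = parse_duration(duration)
        self.batch_size = batch_size
        self._events = deque()  # (timestamp_seconds, payload), oldest first

    def add(self, timestamp, payload):
        """Add an event and return the payloads still inside the window."""
        self._events.append((timestamp, payload))
        # Time constraint: drop events older than `duration` before this event.
        cutoff = timestamp - self.duration_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()
        # Length constraint: keep only the latest `batch_size` events.
        while len(self._events) > self.batch_size:
            self._events.popleft()
        return [p for _, p in self._events]


# The example from the thread: batch size 5, time window 20 mins,
# 10 events arriving within 5 mins (here one every 30 seconds).
window = TimeAndLengthWindow("20 mins", 5)
for i in range(10):
    kept = window.add(i * 30, i)
print(kept)  # [5, 6, 7, 8, 9]
```

Passing a very large batch size (e.g. 1,000,000 with a "1 sec" duration) makes the length check effectively a no-op, which is the workaround suggested in the thread for the time-only case.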
