Hi Tishan,

For my requirement, having a time window alone is adequate, so your point may be valid. However, I'm concerned about the reusability of the extension.
@Srinath, WDYT? Which would be the better option: a single implementation or two separate ones?

Thanks

On Thu, Jun 2, 2016 at 1:48 PM, Tishan Dahanayakage <[email protected]> wrote:

> Charini,
>
> My knowledge of this domain is sparse, so I do not know whether a
> scenario requiring time AND length is a valid business case. If it is a
> valid business case, +1 for the design that includes both parameters in
> the same implementation.
>
> Thanks
> /Tishan
>
> On Thu, Jun 2, 2016 at 12:54 PM, Charini Nanayakkara <[email protected]>
> wrote:
>
>> Hi Tishan,
>>
>> Yes. Assuming the batch size is 5 and the time window is 20 mins, only
>> 5 of the 10 events that arrived within the last 5 mins would be
>> processed because of the batch size constraint (even though all 10
>> would be processed if time alone were considered). Having separate
>> implementations would work in the majority of scenarios, since usually
>> only time OR length is applicable, not both. However, having two
>> implementations would cause trouble in situations where both time and
>> length are important (equivalent to an AND operation on the two
>> constraints). If we need only one of the two constraints, we can use a
>> very large value for the other parameter (e.g. if we only need to limit
>> the number of events by a time = 1 sec constraint, we can specify
>> 1,000,000 for batch size, assuming we know in advance that 1,000,000
>> events would never arrive within 1 sec). IMHO neither of the two
>> options (separate or single implementation) is perfect for every
>> scenario. However, as I understand it, a single implementation would
>> address more cases. What's your opinion on this?
>>
>> Thanks
>>
>> On Thu, Jun 2, 2016 at 10:14 AM, Charini Nanayakkara <[email protected]>
>> wrote:
>>
>>> Hi All,
>>>
>>> I plan to extend the existing regression function by adding a time
>>> parameter. Regression is a function available in the Siddhi stream
>>> processor extension known as timeseries. In the current
>>> implementation, the regression function consumes two or more
>>> parameters and performs regression as follows.
>>>
>>> The mandatory parameters are the dependent attribute Y and the
>>> independent attribute(s) X1, X2,....Xn. For simple linear regression,
>>> only one independent attribute is given. Two or more independent
>>> attributes are given for multiple linear regression.
>>>
>>> timeseries:regress(Y, X1, X2......,Xn)
>>>
>>> The other three parameters, which are optional, are calculation
>>> interval, batch size and confidence interval (ci). Where these are not
>>> specified, the default values are assumed.
>>>
>>> timeseries:regress(calcInterval, batchSize, ci, Y, X1, X2......,Xn)
>>>
>>> Batch size works as a length window in this implementation, which
>>> allows one to restrict the number of events considered when executing
>>> regression in real time. For example, if the length is 5, only the
>>> latest 5 events (the current event and the 4 events prior to it) would
>>> be used for performing regression.
>>>
>>> *The suggested extension would allow the user to restrict the number
>>> of events based on a time window as well, rather than by length only.
>>> Once my task is complete, the regression function would therefore
>>> consume duration as an additional parameter.*
>>>
>>> *timeseries:regress(calcInterval, duration, batchSize, ci, Y, X1, X2......,Xn).*
>>>
>>> Here the parameter 'duration' would consist of two parts, where the
>>> first part specifies the number and the second part specifies the unit
>>> (e.g. 2 sec, 5 mins, 7 days). On arrival of each event, the past
>>> events to be considered for regression would be determined by this
>>> 'duration' (i.e. if a new event arrives at 10.00 a.m and the duration
>>> is 5 mins, only the events which arrived between 9.55 a.m and
>>> 10.00 a.m are considered for regression).
>>>
>>> Suggestions and comments are most welcome.
>>>
>>> Thank you.
>>>
>>> --
>>> Charini Vimansha Nanayakkara
>>> Software Engineer at WSO2
>>> Mobile: 0714126293
>>
>> --
>> Charini Vimansha Nanayakkara
>> Software Engineer at WSO2
>> Mobile: 0714126293
>
> --
> Tishan Dahanayakage
> Software Engineer
> WSO2, Inc.
> Mobile:+94 716481328

--
Charini Vimansha Nanayakkara
Software Engineer at WSO2
Mobile: 0714126293
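
P.S. To make the single-implementation option discussed in this thread concrete, here is a rough Python sketch of a window that applies the time constraint AND the length constraint together before fitting a simple linear regression. It is only an illustration under stated assumptions, not the Siddhi extension code: the class name WindowedRegression, its method names, and the use of timestamps in seconds are all hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import deque


class WindowedRegression:
    """Hypothetical sketch: simple linear regression over the events that
    satisfy BOTH a time window (duration) and a length window (batch size)."""

    def __init__(self, duration_secs, batch_size):
        self.duration_secs = duration_secs
        self.batch_size = batch_size
        self.events = deque()  # (timestamp_secs, y, x), oldest first

    def add(self, timestamp_secs, y, x):
        """Add an event, evict events outside either window, then refit."""
        self.events.append((timestamp_secs, y, x))
        # Time constraint: drop events older than `duration_secs`.
        cutoff = timestamp_secs - self.duration_secs
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()
        # Length constraint: keep only the latest `batch_size` events.
        while len(self.events) > self.batch_size:
            self.events.popleft()
        return self.fit()

    def fit(self):
        """Return (intercept, slope), or None if a fit is not possible."""
        n = len(self.events)
        if n < 2:
            return None
        mean_x = sum(e[2] for e in self.events) / n
        mean_y = sum(e[1] for e in self.events) / n
        sxx = sum((e[2] - mean_x) ** 2 for e in self.events)
        if sxx == 0:
            return None  # all x values identical, slope undefined
        sxy = sum((e[2] - mean_x) * (e[1] - mean_y) for e in self.events)
        slope = sxy / sxx
        return (mean_y - slope * mean_x, slope)


# The case from the thread: batch size 5, time window 20 mins,
# 10 events arriving within 5 mins. Only the latest 5 are kept.
reg = WindowedRegression(duration_secs=20 * 60, batch_size=5)
for i in range(10):
    result = reg.add(timestamp_secs=i * 30, y=2 * i + 1, x=i)
print(len(reg.events), result)
```

With this shape, the "very large value" workaround corresponds to passing something like batch_size=1_000_000 so that only the time constraint ever evicts events.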
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
