Hi Tishan,

For my requirement, having the time window alone is adequate. So your point
might be valid. However, I'm concerned about the reusability of the extension.

@Srinath, WDYT? Which would be the better option? Having a single
implementation or two different ones?

Thanks

On Thu, Jun 2, 2016 at 1:48 PM, Tishan Dahanayakage <[email protected]> wrote:

> Charini,
>
> My knowledge of this domain is sparse. Hence I do not know whether a
> scenario where time AND length both apply is a valid business case. If it
> is a valid business case, +1 for the design including both parameters in
> the same implementation.
>
> Thanks
> /Tishan
>
> On Thu, Jun 2, 2016 at 12:54 PM, Charini Nanayakkara <[email protected]>
> wrote:
>
>> Hi Tishan,
>>
>> Yes. Assuming the batch size is 5 and the time window is 20 mins, only 5
>> out of 10 events which arrive within the last 5 mins would be processed,
>> due to the batch size constraint (even though all 10 events would be
>> processed if time alone were considered). Having separate implementations
>> would work in the majority of scenarios, since usually only time OR length
>> is applicable, not both. However, having two implementations would cause
>> trouble in situations where both the time factor and the length are
>> important (equivalent to an AND operation on the two constraints). If our
>> requirement is to have only one of the two constraints, we can use a very
>> large value for the other parameter (e.g. if we only need to limit the
>> number of events based on a time = 1 sec constraint, we can specify
>> 1,000,000 for the batch size, assuming we have prior knowledge that
>> 1,000,000 events would never arrive within 1 sec). IMHO neither of the two
>> options (separate or single implementation) is perfect for every scenario.
>> However, as I understand it, having a single implementation would help
>> address more cases. What's your opinion on this?
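To make the AND semantics described above concrete, here is a minimal Python sketch. It is not the Siddhi implementation; the class name `TimeAndLengthWindow`, its parameters and the use of timestamps in seconds are assumptions made purely for illustration. An event is retained only if it is both among the latest `batch_size` events and no older than the time window.

```python
from collections import deque


class TimeAndLengthWindow:
    """Hypothetical window that applies a time AND a length constraint."""

    def __init__(self, duration_seconds, batch_size):
        self.duration_seconds = duration_seconds
        # deque(maxlen=...) silently drops the oldest entry once it is full,
        # which enforces the length (batch size) constraint.
        self.events = deque(maxlen=batch_size)

    def add(self, timestamp, event):
        """Add an event and return the events that satisfy both constraints."""
        self.events.append((timestamp, event))
        cutoff = timestamp - self.duration_seconds
        # Enforce the time constraint: drop events older than the window.
        # The newest event is never older than the cutoff, so the deque
        # cannot become empty here.
        while self.events[0][0] < cutoff:
            self.events.popleft()
        return [e for _, e in self.events]


def feed(window, count, spacing_seconds):
    """Feed `count` events spaced `spacing_seconds` apart; return the last view."""
    retained = []
    for i in range(count):
        retained = window.add(i * spacing_seconds, f"event-{i}")
    return retained


# Scenario from the thread: batch size 5, time window 20 mins,
# 10 events arriving within 5 mins (here 30 s apart).
both = feed(TimeAndLengthWindow(20 * 60, 5), 10, 30)

# Time-only behaviour via a very large batch size.
time_only = feed(TimeAndLengthWindow(20 * 60, 1_000_000), 10, 30)
```

In this sketch `both` holds only the latest 5 events, while `time_only` holds all 10, which mirrors the workaround of passing a very large value for the constraint that should not apply.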
>>
>> Thanks
>>
>> On Thu, Jun 2, 2016 at 10:14 AM, Charini Nanayakkara <[email protected]>
>> wrote:
>>
>>> Hi All,
>>>
>>> I plan to extend the existing regression function by adding a time
>>> parameter. Regression is a function available in the Siddhi stream
>>> processor extension known as timeseries. In the current implementation, the
>>> regression function consumes two or more parameters and performs regression
>>> as follows.
>>>
>>> The mandatory parameters are the dependent attribute Y and the
>>> independent attribute(s) X1, X2, ..., Xn. For simple linear regression,
>>> only one independent attribute is given. Two or more independent
>>> attributes are given for multiple linear regression.
>>>
>>> timeseries:regress(Y, X1, X2......,Xn)
>>>
>>> The other three parameters, which are optional, are calculation
>>> interval, batch size and confidence interval (ci). Where those are not
>>> specified, default values are assumed.
>>>
>>> timeseries:regress(calcInterval, batchSize, ci, Y, X1, X2......,Xn)
>>>
>>> Batch size works as a length window in this implementation, which allows
>>> one to restrict the number of events considered when executing regression
>>> in real time. For example, if the batch size is 5, only the latest 5
>>> events (the current event and the 4 events prior to it) would be used for
>>> performing regression.
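As a rough illustration of batch size acting as a length window, the following Python sketch refits a simple linear regression over only the latest `batch_size` events. It is an assumption-laden sketch, not the timeseries extension's code: the names `ols_fit` and `LengthWindowRegression` are hypothetical, and only the single-attribute (simple linear) case is covered.

```python
from collections import deque


def ols_fit(points):
    """Least-squares fit of y = intercept + slope * x over (x, y) pairs.

    Returns (intercept, slope), or None when all x values are identical.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        return None  # slope is undefined for a vertical set of points
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


class LengthWindowRegression:
    """Hypothetical helper: regression over the latest `batch_size` events."""

    def __init__(self, batch_size):
        # The oldest event is dropped automatically once the window is full.
        self.window = deque(maxlen=batch_size)

    def add(self, y, x):
        """Add one event (dependent y first, as in regress(Y, X)) and refit."""
        self.window.append((x, y))
        if len(self.window) < 2:
            return None  # at least two events are needed for a line
        return ols_fit(self.window)


reg = LengthWindowRegression(batch_size=5)
result = None
for x in range(10):
    result = reg.add(2 * x + 1, x)  # events lie on y = 2x + 1
```

After the loop, only the events with x = 5 to 9 remain in the window, and `result` is the fit over those five events.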
>>>
>>> *This suggested extension would allow the user to restrict the number of
>>> events based on a time window as well, rather than constraining based on
>>> length only. Therefore, once my task is complete, the regression function
>>> would consume duration as an additional parameter.*
>>>
>>> *timeseries:regress(calcInterval, duration, batchSize, ci, Y, X1,
>>> X2......,Xn).*
>>>
>>> Here the parameter 'duration' would consist of two parts, where the
>>> first part specifies the number and the second part specifies the unit
>>> (e.g. 2 sec, 5 mins, 7 days). On arrival of each event, the past events to
>>> be considered for performing regression would be selected based on this
>>> 'duration' (e.g. if a new event arrives at 10.00 a.m. and the duration is
>>> 5 mins, only the events which arrived between 9.55 a.m. and 10.00 a.m.
>>> are considered for regression).
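One way the two-part 'duration' value might be handled is sketched below in Python. This is only an illustration under stated assumptions: the function `parse_duration`, the class `TimeWindow`, the accepted unit spellings beyond "sec", "mins" and "days", and the choice to treat the window boundary as inclusive are all hypothetical and not taken from the extension.

```python
import re
from collections import deque

# Assumed unit spellings; the proposal only shows "sec", "mins" and "days".
UNIT_SECONDS = {
    "sec": 1,
    "min": 60, "mins": 60,
    "hour": 3600, "hours": 3600,
    "day": 86400, "days": 86400,
}


def parse_duration(text):
    """Convert a string such as '5 mins' into a number of seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+)\s*", text.lower())
    if not match or match.group(2) not in UNIT_SECONDS:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


class TimeWindow:
    """Hypothetical window keeping events no older than `duration`."""

    def __init__(self, duration):
        self.seconds = parse_duration(duration)
        self.events = deque()

    def add(self, timestamp, event):
        """Add an event (timestamp in seconds) and return the retained events."""
        self.events.append((timestamp, event))
        cutoff = timestamp - self.seconds
        # Events exactly at the cutoff are kept (inclusive boundary, assumed).
        while self.events[0][0] < cutoff:
            self.events.popleft()
        return [e for _, e in self.events]


def clock(hours, minutes):
    """Seconds since midnight, used here as a stand-in timestamp."""
    return hours * 3600 + minutes * 60


window = TimeWindow("5 mins")
window.add(clock(9, 50), "9.50")
window.add(clock(9, 55), "9.55")
window.add(clock(9, 58), "9.58")
retained = window.add(clock(10, 0), "10.00")
```

With the 10.00 a.m. example from the proposal, `retained` contains the events from 9.55, 9.58 and 10.00, and the 9.50 event has been evicted.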
>>>
>>> Suggestions and comments are most welcome.
>>>
>>> Thank you.
>>>
>>> --
>>> Charini Vimansha Nanayakkara
>>> Software Engineer at WSO2
>>> Mobile: 0714126293
>>>
>>>
>>
>>
>
>
> --
> Tishan Dahanayakage
> Software Engineer
> WSO2, Inc.
> Mobile:+94 716481328



-- 
Charini Vimansha Nanayakkara
Software Engineer at WSO2
Mobile: 0714126293
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
