I think having both batchSize and duration is a good idea: it limits the
number of events considered, which can also help to improve performance.
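
To make the combined behaviour concrete, here is a minimal sketch of a
window that applies both constraints (the AND case discussed in the quoted
thread below). It is not the Siddhi implementation: the class name
TimeLengthWindow, its parameters, and the millisecond timestamps are
hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import deque


class TimeLengthWindow:
    """Hypothetical window keeping events that satisfy BOTH limits."""

    def __init__(self, duration_ms, batch_size):
        self.duration_ms = duration_ms  # time constraint ('duration')
        self.batch_size = batch_size    # length constraint ('batchSize')
        self.events = deque()           # (timestamp_ms, payload), oldest first

    def add(self, timestamp_ms, payload):
        """Add an event and return the payloads still inside the window."""
        self.events.append((timestamp_ms, payload))
        # Time constraint: drop events older than the duration.
        cutoff = timestamp_ms - self.duration_ms
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()
        # Length constraint: keep only the latest batch_size events.
        while len(self.events) > self.batch_size:
            self.events.popleft()
        return [p for _, p in self.events]


# Batch size 5 and a 20 min window: of 10 events arriving within
# 5 mins, only the latest 5 remain available for regression.
window = TimeLengthWindow(duration_ms=20 * 60 * 1000, batch_size=5)
for i in range(10):
    remaining = window.add(i * 30_000, i)
print(remaining)
```

Setting batch_size to a very large value reduces this to a pure time
window, which is the workaround suggested in the thread.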

Suho

On Thu, Jun 2, 2016 at 1:59 PM, Charini Nanayakkara <[email protected]>
wrote:

> Hi Tishan,
>
> For my requirement, having a time window alone is adequate. So your point
> might be valid. However, I'm concerned about the reusability of the extension.
>
> @Srinath, WDYT? Which would be the better option? Having a single
> implementation or two different ones?
>
> Thanks
>
> On Thu, Jun 2, 2016 at 1:48 PM, Tishan Dahanayakage <[email protected]>
> wrote:
>
>> Charini,
>>
>> My knowledge of this domain is sparse, so I do not know whether a
>> scenario that needs both time AND length is a valid business case. If it
>> is, +1 for a design that includes both parameters in the same
>> implementation.
>>
>> Thanks
>> /Tishan
>>
>> On Thu, Jun 2, 2016 at 12:54 PM, Charini Nanayakkara <[email protected]>
>> wrote:
>>
>>> Hi Tishan,
>>>
>>> Yes. Assuming batch size is 5 and the time window is 20 mins, only 5 out
>>> of 10 events which arrived within the last 5 mins would be processed due
>>> to the batch size constraint (even though all 10 would be processed if
>>> time alone were considered). Having separate implementations would work
>>> in the majority of scenarios, since usually only time OR length is
>>> applicable, not both. However, two implementations would cause trouble
>>> in situations where both time and length are important (equivalent to an
>>> AND operation on the two constraints). If our requirement is to apply
>>> only one of the two constraints, we can use a very large value for the
>>> other parameter (e.g. if we only need to limit the number of events by a
>>> time = 1 sec constraint, we can specify 1,000,000 for batch size,
>>> assuming we have prior knowledge that 1,000,000 events would never
>>> arrive within 1 sec). IMHO neither of the two options (separate or
>>> single implementation) is perfect for every scenario. However, as I
>>> understand it, a single implementation would help address more cases.
>>> What's your opinion on this?
>>>
>>> Thanks
>>>
>>> On Thu, Jun 2, 2016 at 10:14 AM, Charini Nanayakkara <[email protected]>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I plan to extend the existing regression function by adding a time
>>>> parameter. Regression is a function provided by the Siddhi stream
>>>> processor extension known as timeseries. In the current implementation,
>>>> the regression function takes two or more parameters and performs
>>>> regression as follows.
>>>>
>>>> The mandatory parameters are the dependent attribute Y and the
>>>> independent attribute(s) X1, X2, ..., Xn. For simple linear regression,
>>>> only one independent attribute is given. Two or more independent
>>>> attributes are given for multiple linear regression.
>>>>
>>>> timeseries:regress(Y, X1, X2, ..., Xn)
>>>>
>>>> The three optional parameters are calculation interval, batch size and
>>>> confidence interval (ci). If they are not specified, default values
>>>> are assumed.
>>>>
>>>> timeseries:regress(calcInterval, batchSize, ci, Y, X1, X2, ..., Xn)
>>>>
>>>> Batch size works as a length window in this implementation, which
>>>> restricts the number of events considered when executing regression in
>>>> real time. For example, if batch size is 5, only the latest 5 events
>>>> (the current event and the 4 events prior to it) are used for
>>>> performing regression.
>>>>
>>>> *The suggested extension would allow the user to restrict the number
>>>> of events based on a time window as well, rather than on length only.
>>>> The regression function would therefore take duration as an
>>>> additional parameter once my task is complete.*
>>>>
>>>> *timeseries:regress(calcInterval, duration, batchSize, ci, Y, X1,
>>>> X2, ..., Xn)*
>>>>
>>>> Here the parameter 'duration' would consist of two parts, where the
>>>> first part specifies the number and the second part specifies the unit
>>>> (e.g. 2 sec, 5 mins, 7 days). On arrival of each event, the past events
>>>> considered for regression would be selected based on this 'duration'
>>>> (e.g. if a new event arrives at 10.00 a.m. and the duration is 5 mins,
>>>> only the events which arrived between 9.55 a.m. and 10.00 a.m. are
>>>> considered for regression).
>>>>
>>>> Suggestions and comments are most welcome.
>>>>
>>>> Thank you.
>>>>
>>>> --
>>>> Charini Vimansha Nanayakkara
>>>> Software Engineer at WSO2
>>>> Mobile: 0714126293
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> Tishan Dahanayakage
>> Software Engineer
>> WSO2, Inc.
>> Mobile:+94 716481328
>>
>>
>
>
>


-- 

*S. Suhothayan*
Technical Lead & Team Lead of WSO2 Complex Event Processor
*WSO2 Inc.* http://wso2.com
lean . enterprise . middleware

cell: (+94) 779 756 757 | blog: http://suhothayan.blogspot.com/
twitter: http://twitter.com/suhothayan | linked-in: http://lk.linkedin.com/in/suhothayan
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
