Hi all,

I have 2 comments.

a. The time series extension to CEP, which supports univariate and
multivariate linear regression [1], can be used for this. We can use the
multivariate regression to solve the curve-fitting problem stated in
Lahiru's email. Basically, what we need to do is use *t* and *t^2* as x1
and x2. Then, if we run linear regression, we get a, b, and c such that
V = a + b*t + c*t^2 (note that a and c swap roles compared to Lahiru's
equation).
As Lasantha has mentioned, we do have a forecasting facility as well, but
currently it only works for univariate regression, which is not the case
here. If you really need it, I might be able to extend it for this
use case. For the moment, you can still use the existing regression
facility to determine the coefficients and do the forecasting yourself
(which is just plugging those values into the above equation with the
relevant t values).
Let me also mention that even though the function is called 'linear'
regression, we can use it to fit polynomial curves as long as we know the
degree of the polynomial (which in this case we do). This works because the
model is linear in the coefficients a, b, and c, even though it is not
linear in t.
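
To make the idea concrete, here is a minimal sketch in plain Python (it is
not the Siddhi extension's API, and the names fit_quadratic and forecast
are hypothetical). It treats x1 = t and x2 = t^2 as two regressors, solves
the ordinary least-squares normal equations, and then forecasts by plugging
a future t into V = a + b*t + c*t^2:

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix so that row operations apply to both sides.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: need at least 3 distinct t values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_quadratic(times, values):
    """Least-squares fit of V = a + b*t + c*t^2 with regressors x1=t, x2=t^2."""
    rows = [(1.0, t, t * t) for t in times]
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(3)]
    a, b, c = solve_linear_system(xtx, xty)
    return a, b, c


def forecast(coefficients, t):
    """Plug t into the fitted equation V = a + b*t + c*t^2."""
    a, b, c = coefficients
    return a + b * t + c * t * t


# Made-up sample window: values generated from V = 2 + 3*t + 0.5*t^2.
times = [0, 1, 2, 3, 4]
values = [2 + 3 * t + 0.5 * t * t for t in times]
coefficients = fit_quadratic(times, values)
next_value = forecast(coefficients, 5)
```

One caveat, as an assumption about how this would be used: if t is an epoch
timestamp in milliseconds, t^2 becomes very large and the fit can become
numerically unstable, so it is probably safer to measure t relative to the
start of the window.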

b. Can't we also consider using exponentially weighted moving averages for
the previous approach? Instead of using the average gradient and average
second derivative, we can use 'decaying windows' in CEP to get the
exponentially weighted moving average of the gradient and second
derivative. This should avoid spawning new instances due to sudden
'spikes', as we can control the decay factor to give a practically
acceptable weight to the most recent events compared to older events.
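
As a rough illustration of this suggestion, here is a minimal sketch in
plain Python (not the CEP decaying-window implementation; the function
names and the smoothing factor alpha are assumptions). It computes
finite-difference gradients and second derivatives from (t, value) events
and smooths each with an exponentially weighted moving average:

```python
def ewma(samples, alpha):
    """Exponentially weighted moving average.

    alpha in (0, 1] is the weight given to the newest sample; a smaller
    alpha means older samples decay more slowly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    average = None
    for sample in samples:
        if average is None:
            average = sample  # seed with the first sample
        else:
            average = alpha * sample + (1.0 - alpha) * average
    return average


def differences(points):
    """Rate of change between consecutive (t, value) points.

    Each result is paired with the midpoint time so that the function can
    be applied again to obtain the second derivative.
    """
    return [
        ((t0 + t1) / 2.0, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    ]


def smoothed_derivatives(events, alpha):
    """Return (EWMA of gradient, EWMA of second derivative) for the events."""
    gradients = differences(events)
    second_derivatives = differences(gradients)
    return (
        ewma([g for _, g in gradients], alpha),
        ewma([s for _, s in second_derivatives], alpha),
    )


# A single spike of 9 among values of 1 moves the average only part of the way.
spiky = ewma([1, 1, 1, 9], 0.25)
```

With alpha = 0.25 the spike in the last sample contributes only a quarter
of its value, which is the damping effect described above; the right alpha
would have to be tuned against real load data.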


Seshika

[1] https://docs.wso2.com/display/CEP400/Regression

On Tue, Nov 11, 2014 at 8:51 PM, Lasantha Fernando <[email protected]>
wrote:

> Hi Lahiru,
>
> Would it be possible to use the linear regression already available as
> Siddhi extensions in [1], or maybe improve on those existing extensions
> to fit polynomial curves? The code is available here [2].
>
> I think forecasting is also available which can be useful in this
> usecase. WDYT? Just sharing my 2 cents.. :-)
>
> [1] http://mail.wso2.org/mailarchive/architecture/2014-March/015696.html
> [2]
> https://github.com/wso2-dev/siddhi/tree/master/modules/siddhi-extensions
>
> Thanks,
> Lasantha
>
> On Tue, Nov 11, 2014 at 3:58 PM, Lahiru Sandaruwan <[email protected]>
> wrote:
> > Hi all,
> >
> > This contains the content I already sent to Stratos dev. The idea is to
> > highlight and separate the new improvement.
> >
> > Current implementation
> >
> > Currently CEP calculates the average, gradient, and second derivative and
> > sends those values to Autoscaler. Then Autoscaler predicts the values
> > using S = u*t + 0.5*a*t*t.
> >
> > In this method the CEP calculation is not very accurate, as it does not
> > consider all the events when calculating the gradient and second
> > derivative. Therefore the equation we apply doesn't yield the best
> > prediction.
> >
> > Proposed Implementation
> >
> > CEP's task
> >
> > I think the best approach is to do "curve fitting" [1] for the received
> > event sample in a particular time window. Refer to the "Locally weighted
> > linear regression" section at [2] for more details.
> >
> > We would need a second degree polynomial fitter for this, where we can
> > use the Apache Commons Math library. Refer to the sample at [3]; we can
> > run this with any degree, e.g. 2 or 3. Just increase the degree to
> > increase the accuracy.
> >
> > E.g.
> > So if we get a degree 2 polynomial fitter, we will have an equation like
> > below, where value (v) is our statistic value and time (t) is the time of
> > the event.
> >
> > Equation we get from received events,
> > v = a*t*t + b*t + c
> >
> > So the solution is,
> >
> > - Find memberwise curves that fit the events received in a specific window
> >   (say 10 minutes) at CEP.
> > - Send the parameters of the fitted curve (a, b, and c in the above
> >   equation), with the timestamp of the last event (T) in the window, to
> >   Autoscaler.
> >
> > Autoscaler's task
> >
> > Autoscaler uses the function v = a*t*t + b*t + c to predict the value at
> > any timestamp after the last timestamp.
> >
> > E.g. say we need to find the value (v) after 1 minute (assuming we carried
> > out all the calculations in milliseconds),
> >
> > v = a * (T+60000) * (T+60000) + b * (T+60000) + c
> >
> > So we have memberwise predictions, and we can find the clusterwise
> > prediction by averaging all the memberwise values.
> >
> >
> > Please send your thoughts.
> >
> > Thanks.
> >
> > [1] http://en.wikipedia.org/wiki/Curve_fitting
> > [2] http://cs229.stanford.edu/notes/cs229-notes1.pdf
> > [3] http://commons.apache.org/proper/commons-math/userguide/fitting.html
> >
> >
> > --
> > --
> > Lahiru Sandaruwan
> > Committer and PMC member, Apache Stratos,
> > Senior Software Engineer,
> > WSO2 Inc., http://wso2.com
> > lean.enterprise.middleware
> >
> > email: [email protected] blog: http://lahiruwrites.blogspot.com/
> > linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
> >
>
