henrikingo commented on PR #96: URL: https://github.com/apache/otava/pull/96#issuecomment-3543878090
> > I think to generate a data set that the Hunter paper was concerned with, you need the drop to be short, maybe even 1-2 only:
> > ```
> > drop = 400 + np.random.randn(2) * 5
> > ```
>
> Correct me if I'm wrong, but my understanding was that there are two separate problems:
>
> 1. Previously found critical points disappearing.
> 2. Not detecting the critical points in the first place (because the number of abnormal points is small)

No, these are the same problem. The change points disappear when the interval/window they are in grows larger.

I always assumed this was a feature: in a short time series, say 50-100 points, MongoDB e-divisive with typical parameters would ignore spikes that last only a single point, and might alert for a plateau of 2-3 points that then returns to the original level. (But even then it would produce only 1 change point, because the original MongoDB implementation needed a hard-coded 3 points before it would alert on anything at all, so it is not possible to find 2 neighboring change points. This comes from the Matteson paper; their R reference implementation, I believe, defaulted to a leading 30 points or so. Which would be a long time to wait for a Jira ticket if it was nightly builds!)

...where was I... So then if the series keeps growing, my interpretation is that the short-lived change becomes less significant compared to the entire series, so eventually it is ignored by the algorithm, just as if it were a single point. Conversely, a single point could also trigger an alert if it was large enough. (At least assuming that the series on both of its sides isn't perfectly constant.)

The fix of adding a window is based on the above understanding: it creates a situation where the local computation doesn't take into account more than a small number of local points. And this is why I asked earlier whether Kappa is now equivalent to observing a series grow from 1 point and computing the algorithm for every added point.
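To make the dilution argument concrete, here is a rough sketch. It is not e-divisive and not the Otava implementation: a plain difference of means stands in for the real statistic. The drop level of 400 and noise scale of 5 come from the quoted snippet. The baseline level of 500, the seed, the drop position, the window size of 10 and the helper names `make_series` and `mean_shift` are all assumptions made for illustration.

```python
import random
import statistics


def make_series(total_len, drop_start=50, drop_len=2, seed=0):
    """Baseline around 500 (assumed) with a short drop to around 400.

    Values are drawn in index order, so with the same seed a longer
    series is the shorter one plus extra trailing points.
    """
    rng = random.Random(seed)
    series = []
    for i in range(total_len):
        in_drop = drop_start <= i < drop_start + drop_len
        level = 400 if in_drop else 500
        series.append(level + rng.gauss(0, 5))
    return series


def mean_shift(series, split, window=None):
    """Absolute difference of the means on either side of `split`.

    Without `window`, everything before and after the split is used.
    With `window`, only that many points on each side are used.
    """
    if window is None:
        left, right = series[:split], series[split:]
    else:
        left = series[max(0, split - window):split]
        right = series[split:split + window]
    return abs(statistics.fmean(left) - statistics.fmean(right))


# Watch the same 2-point drop while the series keeps growing.
for n in (60, 100, 500, 2000):
    s = make_series(n)
    whole = mean_shift(s, 50)
    local = mean_shift(s, 50, window=10)
    print(f"n={n:5d}  whole-series shift={whole:6.2f}  windowed shift={local:6.2f}")
```

In this toy setup the whole-series shift at the start of the drop shrinks roughly like 200 / (n - 50) as points are appended, while the windowed shift only ever sees the same 20 local points and so stays put. That is the intuition behind the window fix, shown on a much simpler statistic than the one the algorithm actually uses.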
