Hi Mark,

I'd like to point you to NAB [1], our benchmark for anomaly detection in streaming data. The corpus includes 17 data files representing a variety of server metrics; we selected these files for NAB specifically because they test detectors on the problems you described.
I've plotted a few examples you may be interested in [2-4], where the red dots mark the starting points of true anomalies, and the diamonds mark detections by the HTM anomaly detection algorithm (green and red are true and false positives, respectively).

On your previous questions:

- We typically say HTM needs about 1000 data instances to learn the temporal patterns well enough to start reliably making predictions (and anomaly detections). You'll notice the anomaly scores are relatively high at the beginning of a data stream, but settle down after HTM has learned the sequences well.

- A very noisy stream will result in false-positive detections, but this is true of any anomaly detection algorithm. To decrease the number of false positives, you can raise the threshold on the anomaly likelihood: fewer data points will be flagged as anomalous, but this may come at the cost of more false negatives.

- The temporal memory has a large capacity for storing sequence patterns, so the answer depends on what you mean by "prolonged use". The anomaly likelihood estimation uses several parameters [5] that control how much previous data is used to re-estimate the distribution, but tweaking these generally has little effect on the resulting detections.

[1] https://github.com/numenta/NAB
[2] https://plot.ly/~alavin/3151/anomaly-detections-for-realawscloudwatchec2-cpu-utilization-5f5533csv/
[3] https://plot.ly/~alavin/3187/anomaly-detections-for-realawscloudwatchelb-request-count-8c0756csv/
[4] https://plot.ly/~alavin/3199/anomaly-detections-for-realawscloudwatchrds-cpu-utilization-e47b3bcsv/
[5] https://github.com/numenta/nupic/blob/master/src/nupic/algorithms/anomaly_likelihood.py#L84-106

Cheers,
Alex
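P.S. To make the threshold trade-off concrete, here's a minimal, self-contained sketch. It is not the NuPIC implementation (anomaly_likelihood.py does considerably more); it just fits a rolling Gaussian to the recent raw anomaly scores and thresholds the resulting tail probability, which is enough to show why raising the likelihood threshold trades false positives for false negatives.

```python
import math
from collections import deque

def anomaly_likelihood_stream(scores, window=50):
    """Toy anomaly likelihood: for each raw anomaly score, estimate how
    surprising it is under a Gaussian fit to a sliding window of recent
    scores.  Illustrative only, not NuPIC's anomaly_likelihood.py."""
    history = deque(maxlen=window)
    likelihoods = []
    for s in scores:
        if len(history) >= 2:
            mean = sum(history) / len(history)
            var = sum((x - mean) ** 2 for x in history) / len(history)
            std = math.sqrt(var) or 1e-6  # guard against zero variance
            z = (s - mean) / std
            # Gaussian CDF: values near 1.0 mean the score is unusually
            # high relative to the recent window.
            likelihood = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        else:
            likelihood = 0.5  # not enough history yet
        likelihoods.append(likelihood)
        history.append(s)
    return likelihoods

def flag_anomalies(likelihoods, threshold=0.99):
    """Raising `threshold` flags fewer points: fewer false positives,
    at the risk of more false negatives."""
    return [i for i, p in enumerate(likelihoods) if p >= threshold]
```

For example, on a flat stream with one spike in the raw anomaly scores, a threshold of 0.99 flags only the spike, while a threshold of 0.5 flags nearly everything; the real anomaly-likelihood code gives you the same knob with a better-behaved distribution estimate.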
