Nick, yes, the combined anomaly JSON file contains the anomaly windows, which 
define the periods within which detections count as true positives. The file 
is a dictionary whose keys are the data file names and whose values are lists 
of windows. Each window is a list of two timestamps: its start and its end.
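
As a rough illustration of that layout, here is a minimal Python sketch that parses a dictionary of this shape and checks whether a timestamp falls inside any window. The file name, the timestamps, the timestamp format, and the helper names are all made up for the example; they are not taken from the actual file.

```python
import json
from datetime import datetime

# Hypothetical content in the shape described above:
# file name -> list of [start, end] timestamp pairs.
SAMPLE = """
{
  "example/data_file.csv": [
    ["2014-04-10 07:15:00", "2014-04-11 16:45:00"],
    ["2014-04-13 00:00:00", "2014-04-14 08:30:00"]
  ]
}
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def parse_windows(text):
    """Return {file name: [(start, end), ...]} with datetime endpoints."""
    raw = json.loads(text)
    return {
        name: [
            (datetime.strptime(start, TIME_FORMAT),
             datetime.strptime(end, TIME_FORMAT))
            for start, end in windows
        ]
        for name, windows in raw.items()
    }


def in_any_window(windows, timestamp):
    """True if the timestamp lies inside at least one (start, end) window."""
    return any(start <= timestamp <= end for start, end in windows)


windows_by_file = parse_windows(SAMPLE)
```

With the sample above, in_any_window(windows_by_file["example/data_file.csv"], datetime(2014, 4, 10, 12, 0)) is True, while a timestamp on 2014-04-12 falls between the two windows and gives False.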

You are correct that window sizes are calculated as 
0.1*data_size/numOfAnomalies. I would recommend against defining anomaly 
windows in the way you described, for two main reasons:

  1.  The NAB scoring function relies on the fact that a given anomaly starts 
precisely at the center of the window. The function is a scaled sigmoid, so 
true positives early in the window score higher than later ones; this lets us 
assign appropriate values to earlier and later detections.
  2.  Merely checking each window for the existence of a detection, as in your 
method, ignores the value of detecting anomalies as early as possible; you may 
as well just count the total true/false positives/negatives. Scoring in this 
way tells us very little about how an algorithm performs as it attempts to 
detect anomalies in real time.
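
To make the two points concrete, here is a minimal Python sketch of the window-size formula and of a scaled sigmoid that rewards earlier detections. It is an illustration under assumptions, not the NAB code itself: the function names, the steepness constant of 5, and the convention that the relative position runs from -1 at the start of the window to 0 at its end are all choices made for this example.

```python
import math


def window_length(data_size, num_anomalies):
    """Records per window: 0.1 * data_size / numOfAnomalies, truncated."""
    if num_anomalies <= 0:
        return 0
    return int(0.1 * data_size / num_anomalies)


def centered_window(anomaly_index, length, data_size):
    """(start, end) record indices of a window centered on the anomaly."""
    half = length // 2
    start = max(0, anomaly_index - half)
    end = min(data_size - 1, anomaly_index + half)
    return start, end


def relative_position(detection_index, window):
    """Assumed convention: -1.0 at the window start, 0.0 at the window end."""
    start, end = window
    if end == start:
        return 0.0
    return (detection_index - end) / (end - start)


def scaled_sigmoid(position, steepness=5.0):
    """Sigmoid rescaled to (-1, 1); the steepness value is an assumption."""
    return 2.0 / (1.0 + math.exp(steepness * position)) - 1.0


window = centered_window(1000, window_length(4000, 2), 4000)
early = scaled_sigmoid(relative_position(window[0], window))
late = scaled_sigmoid(relative_position(1050, window))
```

Here a detection at the very start of the window (early) scores close to 1, while one later in the same window (late) scores less. A plain in-window/out-of-window check would treat both detections identically, which is the information lost in point 2.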

Best,
Alex

Alexander Lavin
Software Engineer
Numenta
