Hi Fergal and Mark,

Thanks very much for your pointers and insights.  I would still like to see whether CLA can do better at discerning patterns and then detecting anomalies in those patterns.

Let’s look at the hot gym anomaly example again.  It is obvious that gym consumption is low between 2 and 4 AM.  I grepped each day’s 2 AM energy consumption from rec-center-hourly.csv and put the rows into rec-2AM.csv (attached).  Then I plotted the 2 AM consumption (attached graph).  As the plot shows, on the vast majority of days the energy consumption hovered around 5.  On some days it went up, but it never dipped below 4.5.  One would reasonably assume that 4.5 to 10, maybe even up to 15, is the normal range.  Days on which consumption went above 20 or below 4.5 should be considered abnormal.
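For anyone who wants to repeat the extraction and the range check described above, here is a minimal Python sketch.  It assumes rows in the "M/D/YY H:MM,value" form shown in the data further down; the function names and the 4.5/20 defaults are only illustrative (the thresholds are the ones I eyeballed from the plot), and the sample rows are taken from this thread.

```python
import csv
from datetime import datetime

def rows_at_hour(lines, hour=2):
    """Yield (timestamp, value) for rows whose timestamp falls on `hour`.

    Assumes rows look like "7/2/10 2:00,4.7"; anything that does not
    parse (for example header rows) is skipped.
    """
    for row in csv.reader(lines):
        if len(row) < 2:
            continue
        try:
            ts = datetime.strptime(row[0].strip(), "%m/%d/%y %H:%M")
            value = float(row[1])
        except ValueError:
            continue
        if ts.hour == hour:
            yield ts, value

def out_of_range(rows, low=4.5, high=20.0):
    """Return the rows whose value lies outside [low, high]."""
    return [(ts, v) for ts, v in rows if v < low or v > high]

# Sample rows copied from this thread (the 12/31 rows are the edited zeros).
sample = [
    "10/13/10 2:00,24.2",
    "10/14/10 2:00,8.3",
    "12/31/10 1:00,0",
    "12/31/10 2:00,0",
]
flagged = out_of_range(rows_at_hour(sample))
for ts, v in flagged:
    print(ts.strftime("%Y-%m-%d %H:%M"), v)
```

With an actual file, the same functions can be fed an open file handle instead of the sample list, since csv.reader accepts any iterable of lines.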

The CLA, however, didn’t give a high anomaly score in either the >20 or the <4.5 case.  The highest anomaly score was merely 0.3, when consumption (~25) was 5 times the normal level.

Your suggested approach appears valid for the zero-consumption case, but I am not sure how it deals with the aforementioned scenario, where the normal pattern is highly skewed.  Furthermore, the suggested approach assumes we know the normal range beforehand (and hence are able to create an encoder that deals with 0).  Imagine we were dealing not with one hot gym but with tens of thousands of servers in a data center.  Having a human know the normal range for each server during low-activity hours is not practical.  So the question is: can CLA learn the range and pick a suitable encoder automatically?


Regards,
Tom




Attachment: PastedGraphic-1.pdf
Description: Adobe PDF document



INFO:__main__:Anomaly detected at [2010-10-13 02:00:00]. Anomaly score: 0.300000


7/2/10 2:00,4.7
7/3/10 2:00,6
7/4/10 2:00,5
7/5/10 2:00,4.7
7/6/10 2:00,4.7
7/7/10 2:00,12
7/8/10 2:00,4.8
7/9/10 2:00,4.7
7/10/10 2:00,4.7
7/11/10 2:00,4.8
7/12/10 2:00,4.8
7/13/10 2:00,4.8
7/14/10 2:00,4.8
7/15/10 2:00,4.7
7/16/10 2:00,5.2
7/17/10 2:00,4.7
7/18/10 2:00,4.7
7/19/10 2:00,4.9
7/20/10 2:00,4.9
7/21/10 2:00,15.7
7/22/10 2:00,4.8
7/23/10 2:00,4.7
7/24/10 2:00,4.7
7/25/10 2:00,4.7
7/26/10 2:00,4.9
7/27/10 2:00,4.8
7/28/10 2:00,4.8
7/29/10 2:00,4.8
7/30/10 2:00,4.9
7/31/10 2:00,4.9
8/1/10 2:00,4.8
8/2/10 2:00,4.7
8/3/10 2:00,5.3
8/4/10 2:00,14.7
8/5/10 2:00,4.9
8/6/10 2:00,20.5
8/7/10 2:00,4.7
8/8/10 2:00,4.9
8/9/10 2:00,4.7
8/10/10 2:00,4.9
8/11/10 2:00,4.8
8/12/10 2:00,4.8
8/13/10 2:00,4.8
8/14/10 2:00,4.9
8/15/10 2:00,4.8
8/16/10 2:00,4.7
8/17/10 2:00,4.9
8/18/10 2:00,5.6
8/19/10 2:00,4.9
8/20/10 2:00,4.8
8/21/10 2:00,11
8/22/10 2:00,5
8/23/10 2:00,5.1
8/24/10 2:00,5.4
8/25/10 2:00,5
8/26/10 2:00,7
8/27/10 2:00,5
8/28/10 2:00,5.1
8/29/10 2:00,5
8/30/10 2:00,5.1
8/31/10 2:00,5.1
9/1/10 2:00,5.1
9/2/10 2:00,5.3
9/3/10 2:00,4.9
9/4/10 2:00,4.9
9/5/10 2:00,17.6
9/6/10 2:00,4.8
9/7/10 2:00,4.7
9/8/10 2:00,4.8
9/9/10 2:00,5.1
9/10/10 2:00,4.9
9/11/10 2:00,11.4
9/12/10 2:00,4.8
9/13/10 2:00,4.9
9/14/10 2:00,4.8
9/15/10 2:00,5
9/16/10 2:00,7.5
9/17/10 2:00,7.8
9/18/10 2:00,16.8
9/19/10 2:00,7.8
9/20/10 2:00,7.8
9/21/10 2:00,13
9/22/10 2:00,7.9
9/23/10 2:00,8
9/24/10 2:00,7.6
9/25/10 2:00,12.5
9/26/10 2:00,7.9
9/27/10 2:00,8
9/28/10 2:00,9
9/29/10 2:00,7.7
9/30/10 2:00,12.7
10/1/10 2:00,7.7
10/2/10 2:00,14.3
10/3/10 2:00,7.7
10/4/10 2:00,7.8
10/5/10 2:00,7.6
10/6/10 2:00,8
10/7/10 2:00,8.1
10/8/10 2:00,7.8
10/9/10 2:00,7.7
10/10/10 2:00,21.8
10/11/10 2:00,8
10/12/10 2:00,7.8
10/13/10 2:00,24.2
10/14/10 2:00,8.3
10/15/10 2:00,20.3
10/16/10 2:00,4.8
10/17/10 2:00,4.8
10/18/10 2:00,16.1
10/19/10 2:00,5
10/20/10 2:00,5
10/21/10 2:00,4.9
10/22/10 2:00,4.8
10/23/10 2:00,4.9
10/24/10 2:00,4.9
10/25/10 2:00,4.9
10/26/10 2:00,4.7
10/27/10 2:00,5
10/28/10 2:00,4.9
10/29/10 2:00,4.8
10/30/10 2:00,4.9
10/31/10 2:00,4.9
11/1/10 2:00,5.1
11/2/10 2:00,4.8
11/3/10 2:00,5
11/4/10 2:00,4.8
11/5/10 2:00,4.8
11/6/10 2:00,4.9
11/7/10 2:00,4.8
11/8/10 2:00,4.9
11/9/10 2:00,4.9
11/10/10 2:00,5
11/11/10 2:00,4.9
11/12/10 2:00,5.1
11/13/10 2:00,4.9
11/14/10 2:00,4.9
11/15/10 2:00,5.1
11/16/10 2:00,5
11/17/10 2:00,5
11/18/10 2:00,4.8
11/19/10 2:00,4.8
11/20/10 2:00,4.9
11/21/10 2:00,5.7
11/22/10 2:00,5
11/23/10 2:00,4.8
11/24/10 2:00,5
11/25/10 2:00,4.8
11/26/10 2:00,5.1
11/27/10 2:00,5
11/28/10 2:00,5
11/29/10 2:00,20.5
11/30/10 2:00,4.9
12/1/10 2:00,4.9
12/2/10 2:00,4.8
12/3/10 2:00,4.8
12/4/10 2:00,14
12/5/10 2:00,4.9
12/6/10 2:00,20.8
12/7/10 2:00,5
12/8/10 2:00,5
12/9/10 2:00,5
12/10/10 2:00,4.9
12/11/10 2:00,13.4
12/12/10 2:00,4.9
12/13/10 2:00,5.1
12/14/10 2:00,5.1
12/15/10 2:00,4.9
12/16/10 2:00,5.1
12/17/10 2:00,4.8
12/18/10 2:00,4.9
12/19/10 2:00,4.8
12/20/10 2:00,5.1
12/21/10 2:00,4.8
12/22/10 2:00,4.9
12/23/10 2:00,5
12/24/10 2:00,4.9
12/25/10 2:00,5.1
12/26/10 2:00,5
12/27/10 2:00,4.9
12/28/10 2:00,4.8
12/29/10 2:00,4.9
12/30/10 2:00,5
12/31/10 2:00,5

On Dec 10, 2013, at 10:53 PM, Tom Tan <[email protected]> wrote:

Hi,

I am very intrigued by the NuPIC CLA and its potential.  I was trying to use the CLA algorithm to perform anomaly detection.  My data set is similar to that of the hotgym example: usage is high during the day/business hours and low, but never zero, during night/non-business hours (sorry, I can’t share my data set).  Zero usage means an outage and should be considered an anomaly regardless of when it happens.  The problem is that CLA failed to raise the anomaly score when an outage/zero usage happened during non-business hours.

I managed to reproduce the problem using the hot gym anomaly example.

I made the following change to "extra/hotgym/rec-center-hourly.csv":

4373,4374c4373,4374
< 12/31/10 1:00,0
< 12/31/10 2:00,0
---
> 12/31/10 1:00,4.9
> 12/31/10 2:00,5

That means zero energy usage at 1 and 2 AM, which should be abnormal.  Yet the corresponding CLA anomaly scores are 0 (shown below):

INFO:__main__:Anomaly detected at [2010-12-31 01:00:00]. Anomaly score: 0.000000.
INFO:__main__:Anomaly detected at [2010-12-31 02:00:00]. Anomaly score: 0.000000.

When I used 24 “traditional” statistical models, one for each hour of the day, I was able to detect zero usage and report it as an anomaly.  CLA doesn’t appear to be superior in this case.
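To make the per-hour baseline concrete, here is a rough Python sketch of one way such models could look.  It is not the exact model I ran: the median/MAD score, the function names, the min_scale floor and the 3.5 cutoff are all assumptions chosen for illustration, and the training values are a few 2 AM readings from the hot gym data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

def fit_hourly_models(readings):
    """Fit one robust model per hour of day.

    readings: iterable of (datetime, value).
    Returns {hour: (median, MAD)}, i.e. up to 24 independent models.
    """
    by_hour = defaultdict(list)
    for ts, value in readings:
        by_hour[ts.hour].append(value)
    models = {}
    for hour, values in by_hour.items():
        med = median(values)
        mad = median([abs(v - med) for v in values])
        models[hour] = (med, mad)
    return models

def robust_score(models, ts, value, min_scale=0.1):
    """Distance from the hour's median in MAD-based units.

    1.4826 * MAD approximates a standard deviation for normal data;
    min_scale (an assumption) avoids division by zero on flat series.
    """
    med, mad = models[ts.hour]
    scale = max(1.4826 * mad, min_scale)
    return abs(value - med) / scale

# 2 AM readings for 12/24/10 through 12/30/10 from the hot gym data.
history = [
    (datetime(2010, 12, day, 2), v)
    for day, v in zip(range(24, 31), [4.9, 5.1, 5.0, 4.9, 4.8, 4.9, 5.0])
]
models = fit_hourly_models(history)

THRESHOLD = 3.5  # assumed cutoff, not a tuned value
zero_score = robust_score(models, datetime(2010, 12, 31, 2), 0.0)
normal_score = robust_score(models, datetime(2010, 12, 31, 2), 5.0)
print(zero_score > THRESHOLD, normal_score > THRESHOLD)
```

The median and MAD are used here instead of mean and standard deviation because the night-time values are heavily skewed by occasional spikes, which would otherwise inflate the spread and hide a drop to zero.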

Can the CLA model be tuned to account for scenarios like this?

Regards,
Tom


