Hi Tom,

If you use the standard (uniform) ScalarEncoder, zero will look very much
like the values close to zero. This is because the default is to treat two
scalars as semantically similar, or even identical, if they are very close
together on the number line. That is indeed the right behaviour for your
data in the normal operating range.
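
To make the overlap concrete, here is a rough sketch in plain Python. It is
not NuPIC's ScalarEncoder, only a simplified stand-in for a uniform scalar
encoding; the function names and the parameter values (n=100, w=21, range
0 to 100) are assumptions chosen for illustration.

```python
def encode_uniform(value, minval=0.0, maxval=100.0, n=100, w=21):
    """Simplified uniform scalar encoding (illustrative, not NuPIC's code).

    Returns n bits with one contiguous run of w active bits whose position
    is proportional to where value falls within [minval, maxval].
    """
    value = min(max(value, minval), maxval)  # clip to the encoder range
    n_buckets = n - w + 1                    # possible start positions
    fraction = (value - minval) / (maxval - minval)
    start = int(round(fraction * (n_buckets - 1)))
    return [1 if start <= i < start + w else 0 for i in range(n)]


def overlap(a, b):
    """Number of bit positions that are active in both encodings."""
    return sum(x & y for x, y in zip(a, b))


if __name__ == "__main__":
    zero = encode_uniform(0.0)
    low = encode_uniform(4.9)    # a typical night-time reading
    high = encode_uniform(50.0)  # a typical daytime reading
    print("overlap(0, 4.9) =", overlap(zero, low))
    print("overlap(0, 50)  =", overlap(zero, high))
```

With these assumed settings, 0 and 4.9 share most of their active bits,
while 0 and 50 share none, so a zero at night looks almost exactly like a
normal low reading.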

In order to treat zero as an anomaly, you need to reserve several bits for
it (possibly up to w bits) in the encoder you use. This means either
writing your own custom encoder or making changes to the standard one. If
your data contains no zeroes under normal circumstances, the CLA will not
learn to predict zeroes, and a zero will then register as an anomaly.
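
As a minimal sketch of the idea, assuming a simplified uniform encoding
rather than NuPIC's real encoder classes, the reserved bits can be a
separate block appended to the scalar bits. The names and parameter values
here (encode_with_zero_block, n=100, w=21, range 0 to 100) are hypothetical
and only for illustration.

```python
def encode_with_zero_block(value, minval=0.0, maxval=100.0, n=100, w=21):
    """Illustrative encoder: n scalar bits plus w bits reserved for zero.

    An exact zero activates only the reserved block. Any other value is
    encoded as a contiguous run of w bits in the scalar part, and the
    reserved block stays off.
    """
    if value == 0:
        return [0] * n + [1] * w             # reserved block only
    value = min(max(value, minval), maxval)  # clip to the encoder range
    n_buckets = n - w + 1                    # possible start positions
    fraction = (value - minval) / (maxval - minval)
    start = int(round(fraction * (n_buckets - 1)))
    scalar_bits = [1 if start <= i < start + w else 0 for i in range(n)]
    return scalar_bits + [0] * w


def overlap(a, b):
    """Number of bit positions that are active in both encodings."""
    return sum(x & y for x, y in zip(a, b))


if __name__ == "__main__":
    zero = encode_with_zero_block(0.0)
    low = encode_with_zero_block(4.9)
    print("active bits for zero:", sum(zero))
    print("overlap(0, 4.9) =", overlap(zero, low))
```

Because zero no longer shares any bits with small non-zero readings, it
can no longer be mistaken for one of them. Using the full w bits for the
reserved block keeps the number of active bits the same for every input.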

You would also need to ensure that the region can recognise zeroes, so
first train it with the temporal pooler (TP) turned off on data which
includes zeroes (using your custom encoder). The spatial pooler (SP) will
then work for all values of your data, zero included. Once it is trained,
turn on the TP and set it to detect anomalies.

Regards,

Fergal Byrne


On Wed, Dec 11, 2013 at 6:53 AM, Tom Tan <[email protected]> wrote:

> Hi,
>
> I am very intrigued by the NuPIC CLA and its potential.  I was trying to
> use the CLA algorithm to perform anomaly detection.  My data set is similar
> to that of the hotgym example: the usage is high during the day/business
> hours and low, but never zero, during night/non-business hours (sorry I
> can’t share my data set).  Zero usage means an outage and should be
> considered an anomaly regardless of when it happens.  The problem is that
> the CLA failed to raise the anomaly score when an outage (zero usage)
> happened during non-business hours.
>
> I managed to reproduce the problem using the hot gym anomaly example.
>
> I made the following change to "extra/hotgym/rec-center-hourly.csv":
>
> 4373,4374c4373,4374
> < 12/31/10 1:00,0
> < 12/31/10 2:00,0
> ---
> > 12/31/10 1:00,4.9
> > 12/31/10 2:00,5
>
> That means zero energy usage at 1 and 2 AM, which should be abnormal.
> The corresponding CLA anomaly scores are 0 (shown below):
>
> INFO:__main__:Anomaly detected at [2010-12-31 01:00:00]. Anomaly score:
> 0.000000.
> INFO:__main__:Anomaly detected at [2010-12-31 02:00:00]. Anomaly score:
> 0.000000.
>
> When I used 24 “traditional” statistical models, one for each hour of the
> day, I was able to detect zero usage and report it as an anomaly.  CLA
> doesn’t appear to be superior in this case.
>
> Can the CLA model be tuned to account for scenarios like this?
>
> Regards,
> Tom
>
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>


-- 

Fergal Byrne, Brenter IT

<http://www.examsupport.ie>http://inbits.com - Better Living through
Thoughtful Technology

e:[email protected] t:+353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie