Thanks Daniel, that is a good explanation. The longer version can be
seen here:
http://numenta.com/learn/science-of-anomaly-detection.html
And you probably want to understand the Temporal Memory algorithm as
well since the anomaly score is built on top of that.
On Tue, Nov 3, 2015 at 12:56 PM, Daniel McDonald
<[email protected] <mailto:[email protected]>> wrote:
There's one aspect of this thread that I don't feel has been touched
on, which may help in understanding prediction and the anomaly
score. I learned this at the spring hackathon in NY.
If you look at how the anomaly score is implemented [1], you'll see
that it is the fraction of currently active columns that were not
predicted at the previous step. That is, (#active -
#activeAndPredicted) / #active. Note that this formula does not
depend on the total
number of predicted columns. In fact, if the HTM predicts all
columns, the anomaly score will be 0 for any subsequent input. In
this case, the HTM would be completely uncertain about the next step
in the sequence, so it predicts a superposition of all possible
patterns; therefore, any subsequent input is not anomalous.
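Here is a minimal sketch of that formula (the real implementation is
computeRawAnomalyScore() in anomaly.py [1] and handles a few more
details; this toy version only illustrates the ratio):

    import numpy as np

    def raw_anomaly_score(active_columns, prev_predicted_columns):
        # Fraction of the currently active columns that were NOT
        # predicted at the previous step:
        # (#active - #activeAndPredicted) / #active
        active = np.asarray(active_columns)
        if active.size == 0:
            return 0.0  # nothing active, so nothing to be surprised by
        both = np.intersect1d(active, prev_predicted_columns)
        return (active.size - both.size) / float(active.size)

    # If every column was predicted, no input can be surprising:
    print(raw_anomaly_score([3, 7, 42], range(2048)))  # 0.0
    # If nothing was predicted, every active column is a surprise:
    print(raw_anomaly_score([3, 7, 42], []))           # 1.0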
That is exactly what happened to my Market Patterns hack at the
hackathon. After training the HTM on years of stock market data,
the anomaly score dropped quite low; however, when I looked
carefully at what was going on, the HTM had, in fact, saturated and
was predicting more than half of the columns to be active at each
step in the sequence. In effect, it was saying that the sequences
were unpredictable and anything was possible in the next step (we
already knew that about the stock market, right?). Consequently,
whatever happened next was not anomalous.
When I look at your example data, I read it this way:
At 175, 0.0 was read and 0.0 is the prediction for the next step.
The anomaly score of 0.325 is meaningless, because we don't have
data from the previous step.
At 176, 62 was read, which doesn't match the prediction of 0.0 (from
175), so it is anomalous (0.65). 52 is predicted for the next step.
At 177, 402 is read. It is completely anomalous (1.0). That is,
there is no overlap between the columns predicted for the value 52
and the columns active for the value 402. If you are using a scalar
encoder, that makes sense, since the bit patterns for such different
numbers likely have no overlap in the encoding or in the SDR
produced by the SP. 0.0 is predicted for the next step.
At 178, 0 is read, and the anomaly score drops low (0.125), since
the actual input closely matches what was predicted at the previous
step. The score isn't exactly 0, because the predicted SDR from the
previous step and the encoded SDR for the new input may differ in
some columns. In other words, in the previous step, when 0.0 was
reported as the prediction, this was only an approximate translation
of a predicted SDR, where 0.0 was the closest decoded
representation. 0.0 is predicted for the next step.
At 179, 402 is read, which is completely anomalous (1.0) because the
predicted SDR for 0.0 had no column in common with the encoded SDR
for 402.
180 is similar to 178, and 0.0 is predicted.
At 181, 3 is read. The anomaly score is low (0.05), because the
scalar encoder produces overlapping patterns for similar numbers, so
there is likely overlap in the SDRs for 0 and 3. 402 is predicted.
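A quick way to see this for yourself (a sketch using nupic's
ScalarEncoder; the parameters are illustrative guesses, not the
settings of the model discussed here):

    import numpy
    from nupic.encoders.scalar import ScalarEncoder

    # 21 active bits out of 400, over the range 0-500 (guesses).
    enc = ScalarEncoder(w=21, minval=0, maxval=500, n=400, forced=True)

    def overlap(a, b):
        # Number of encoding bits the two values have in common.
        return int(numpy.sum(numpy.logical_and(enc.encode(a),
                                               enc.encode(b))))

    print(overlap(0, 3))    # high: 0 and 3 share most of their bits
    print(overlap(0, 402))  # 0: the encodings of 0 and 402 are disjoint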
At 182, 50 is read. The anomaly score is low (0.1), which is a bit
puzzling; however, it may be due to saturation. The prediction of
402 could represent a case where many columns were predicted
representing a superposition of possible states, and 402 was just
the strongest one (i.e., had the highest overlap of the encoded SDR
for 402 with the predicted columns). That is, 52 may have also been
predicted, but to a lesser degree than 402. It may be helpful to
look at how many columns are predicted vs. active in each step to
see when this happens. If the number of predicted columns suddenly
jumps, it means that the HTM is uncertain about the next step (or,
that it sees many possible next steps given the current context).
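For example, something like this (a sketch assuming you can already
extract the sets of active and predicted columns at each step; how
you get them depends on which API you are using):

    def saturation_report(step, active_columns, predicted_columns):
        # Compare how many columns the TM predicted with how many
        # actually became active; a sudden jump in the ratio suggests
        # the model is predicting "a bit of everything".
        active = set(active_columns)
        predicted = set(predicted_columns)
        ratio = len(predicted) / float(max(len(active), 1))
        print("step %d: %d active, %d predicted (%.1fx), %d in both"
              % (step, len(active), len(predicted), ratio,
                 len(active & predicted)))
        if ratio > 2.0:  # arbitrary threshold; tune for your model
            print("  -> far more columns predicted than active;"
                  " the TM may be saturated")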
[1]
https://github.com/numenta/nupic/blob/master/src/nupic/algorithms/anomaly.py
Best regards,
Daniel
On Tue, Nov 3, 2015 at 4:23 AM, Wakan Tanka
<[email protected]> wrote:
Hello NuPIC,
Here
http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2015-November/012139.html
is a discussion about the correct interpretation of NuPIC output,
which I would like to extend. First I will provide a short summary
and then ask some further questions.
Consider the following output:
step,original,prediction,anomaly score
175,0,0.0,0.32500000000000001
176,62,52.0,0.65000000000000002
177,402,0.0,1.0
178,0,0.0,0.125
179,402,0.0,1.0
180,0,0.0,0.0
181,3,402.0,0.050000000000000003
182,50,52.0,0.10000000000000001
183,68,13.0,0.90000000000000002
This is the output of one-step-ahead prediction without using the
inference shifter. It basically means that the prediction made at
step N is for step N+1. In other words, if the prediction is
perfectly right, then the prediction value at step N should
correspond to the original value at step N+1.
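For example, a small sketch of that alignment, using a few rows from
the output above (this shift is what the InferenceShifter automates):

    rows = [  # (step, original, prediction) copied from the output
        (179, 402, 0.0),
        (180, 0, 0.0),
        (181, 3, 402.0),
    ]
    for (step, _, predicted), (_, actual, _) in zip(rows, rows[1:]):
        print("prediction %s at step %d is for the value %s at step %d"
              % (predicted, step, actual, step + 1))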
The anomaly score can be viewed as the confidence of a prediction.
For example, NuPIC might only be 23% confident in the best
prediction it gives, in which case the anomaly score could be
very high. This is the case at step 179, where the prediction is 0
and the original value at step 180 is 0. Note that the anomaly
score at step 179 is 1.0. It means that NuPIC was not confident in
the prediction, despite the prediction being correct.
The opposite situation happens at step 180, where the prediction is
0 and the original value at step 181 is 3. Note that the anomaly
score at step 180 is 0. That means that NuPIC was quite confident
in the prediction, but it was not correct.
Questions:
1. Does the anomaly score on a given line also take into account
the original value on that line? For example, does the anomaly
score on this line
181,3,402.0,0.050000000000000003
take into account that 3 is the original value? Or is it computed
without respect to this value?
2. Is it possible to compute some kind of debug information
regarding the prediction and anomaly score? I mean something like
this from NuPIC's perspective:
I'm 23% sure that next value will be 10
I'm 27% sure that next value will be 20
I'm 50% sure that next value will be 30
3. Is it OK to predict data zero steps forward if I'm just
interested in the prediction accuracy?
4. Does NuPIC do some kind of look-back? I mean, if NuPIC was
confident at step 180 that the next value would be 0, but it later
turns out that this was a mistake, does NuPIC somehow recompute
the anomaly score from step 180 for further data processing? Or is
this done automatically in HTM?
PS: I've cross-posted this question on SO to reach more people:
http://stackoverflow.com/questions/33495388/how-to-correctly-interpret-nupic-output-vol-2