There's one aspect of this thread that I don't feel has been touched on, which may help in understanding prediction and the anomaly score. I learned this at the spring hackathon in NY.
If you look at how the anomaly score is implemented [1], you'll see that it computes the ratio of the difference between the number of active columns and the number of active columns that were also predicted, to the number of active columns. That is, (#active - #activeAndPredicted) / #active. Note that this formula does not depend on the total number of predicted columns. In fact, if the HTM predicts all columns, the anomaly score will be 0 for any subsequent input. In this case, the HTM is completely uncertain about the next step in the sequence, so it predicts a superposition of all possible patterns; therefore, no subsequent input is anomalous.

That is exactly what happened to my Market Patterns hack at the hackathon. After training the HTM on years of stock market data, the anomaly score dropped quite low; however, when I looked carefully at what was going on, the HTM had in fact saturated and was predicting more than half of the columns to be active at each step in the sequence. In effect, it was saying that the sequences were unpredictable and anything was possible in the next step (we already knew that about the stock market, right?). Consequently, whatever happened next was not anomalous.

When I look at your example data, I read it this way:

At 175, 0.0 was read, and 0.0 is the prediction for the next step. The anomaly score of 0.325 is meaningless, because we don't have data from the previous step.

At 176, 62 was read, which doesn't match the prediction of 0.0 (from 175), so it is anomalous (0.65). 52 is predicted for the next step.

At 177, 402 is read. It is completely anomalous (1.0); that is, there is no overlap between the columns predicted for the value 52 and the columns active for the value 402. If you are using a scalar encoder, that makes sense, since the bit patterns for such different numbers likely have no overlap in the encoding or in the SDRs produced by the SP. 0.0 is predicted for the next step.
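The formula above, and the two boundary cases it produces (all columns predicted gives 0, no overlap gives 1), can be sketched as follows. This is a minimal illustration assuming the active and predicted columns are available as sets of column indices, not the actual NuPIC implementation (which is linked in [1]):

```python
def anomaly_score(active_columns, predicted_columns):
    """Return (#active - #activeAndPredicted) / #active."""
    active = set(active_columns)
    if not active:
        return 0.0  # no active columns, so nothing can be anomalous
    predicted = set(predicted_columns)
    overlap = len(active & predicted)  # columns both active and predicted
    return (len(active) - overlap) / len(active)

# Saturation: if every column is predicted, the score is 0 for any input.
all_columns = set(range(2048))
print(anomaly_score({3, 7, 11}, all_columns))  # 0.0

# No overlap between predicted and active columns gives a score of 1.
print(anomaly_score({3, 7, 11}, {100, 200}))  # 1.0
```

Note that the denominator is the number of *active* columns only, which is why predicting everything drives the score to zero.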
At 178, 0 is read, and the anomaly score drops low (0.125), since the actual value closely matches what was predicted at the previous step. The score isn't exactly 0, because the predicted SDR from the previous step and the encoded SDR for the new input may differ in some columns. In other words, when 0.0 was reported as the prediction in the previous step, that was only an approximate translation of the predicted SDR, with 0.0 being the closest decoded representation. 0.0 is predicted for the next step.

At 179, 402 is read, which is completely anomalous (1.0) because the predicted SDR for 0.0 had no column in common with the encoded SDR for 402. 180 is similar to 178, and 0.0 is predicted.

At 181, 3 is read. The anomaly score is low (0.05), because the scalar encoder produces overlapping patterns for similar numbers, so there is likely overlap between the SDRs for 0 and 3. 402 is predicted.

At 182, 50 is read. The anomaly score is low (0.1), which is a bit puzzling; however, it may be due to saturation. The prediction of 402 could represent a case where many columns were predicted, representing a superposition of possible states, and 402 was just the strongest one (i.e., the encoded SDR for 402 had the highest overlap with the predicted columns). That is, 52 may also have been predicted, but to a lesser degree than 402.

It may be helpful to look at how many columns are predicted vs. active in each step to see when this happens. If the number of predicted columns suddenly jumps, it means that the HTM is uncertain about the next step (or that it sees many possible next steps given the current context).
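That check can be sketched as a small helper. This is only an illustration, assuming you log a (#predicted, #active) pair per step; the 0.5 threshold is arbitrary, chosen to match the "more than half of the columns" symptom I saw in my hack:

```python
def saturation_steps(counts, num_columns, threshold=0.5):
    """Return the step indices where the fraction of predicted columns
    exceeds `threshold`, i.e. where the HTM looks saturated.

    `counts` is a sequence of (num_predicted, num_active) pairs, one per step.
    """
    flagged = []
    for step, (num_predicted, _num_active) in enumerate(counts):
        if num_predicted > threshold * num_columns:
            flagged.append(step)
    return flagged

# With 2048 columns, steps predicting more than 1024 columns are flagged.
counts = [(40, 40), (1200, 40), (1900, 40), (300, 40)]
print(saturation_steps(counts, 2048))  # [1, 2]
```

A sudden jump in the flagged steps is the signature of the "predicting a superposition of everything" situation described above.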
[1] https://github.com/numenta/nupic/blob/master/src/nupic/algorithms/anomaly.py

Best regards,
Daniel

On Tue, Nov 3, 2015 at 4:23 AM, Wakan Tanka <[email protected]> wrote:
> Hello NuPIC,
>
> Here
>
> http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2015-November/012139.html
>
> is a discussion about the correct interpretation of NuPIC output which I
> would like to extend. First I will provide a short summary and then ask
> another question.
>
> Consider the following output:
>
> step,original,prediction,anomaly score
> 175,0,0.0,0.32500000000000001
> 176,62,52.0,0.65000000000000002
> 177,402,0.0,1.0
> 178,0,0.0,0.125
> 179,402,0.0,1.0
> 180,0,0.0,0.0
> 181,3,402.0,0.050000000000000003
> 182,50,52.0,0.10000000000000001
> 183,68,13.0,0.90000000000000002
>
> This is the output of one-step-ahead prediction without using the inference
> shifter. It basically means that the prediction made at step N is for step
> N+1. In other words, if the prediction is perfectly right, then the
> prediction value at step N should correspond to the original value at step
> N+1.
>
> The anomaly score can be viewed as the confidence of the prediction. For
> example, NuPIC might be only 23% confident in the best prediction it gives,
> in which case the anomaly score could be very high. This is the case at
> step 179, where the prediction is 0 and the original value at step 180 is
> 0. Note that the anomaly score at step 179 is 1.0. It means that NuPIC was
> not confident in the prediction, even though the prediction was correct.
>
> The opposite situation happens at step 180, where the prediction is 0 and
> the original value at step 181 is 3. Note that the anomaly score at step
> 180 is 0. That means that NuPIC was quite confident in the prediction, but
> it was not correct.
>
> Questions:
> 1. Does the anomaly score on a given line also take into account the
> original value on that line? For example, does the anomaly score on this
> line
> 181,3,402.0,0.050000000000000003
> take into account that 3 is the original value? Or is it computed without
> respect to this value?
>
> 2. Is it possible to compute some kind of debug information regarding the
> prediction and anomaly score? I mean something like this from NuPIC's
> perspective:
> I'm 23% sure that the next value will be 10
> I'm 27% sure that the next value will be 20
> I'm 50% sure that the next value will be 30
>
> 3. Is it OK to predict data zero steps forward if I'm just interested in
> the prediction accuracy?
>
> 4. Does NuPIC do some kind of look-back? I mean, if NuPIC was confident at
> step 180 that the next value would be 0, but it later turned out to be a
> mistake, does NuPIC somehow recompute the anomaly score from step 180 for
> further data processing? Or is this done automatically in HTM?
>
> PS: I've cross-posted this question on SO to reach more people:
> http://stackoverflow.com/questions/33495388/how-to-correctly-interpret-nupic-output-vol-2
