Just to make sure I’m not pulling all this out of thin air, let me write some code to visualize contours for a 2D dataset, much like the figure in this thread. I’ll run it under different settings to see how they change the results, and then run it repeatedly under the same settings to investigate the effect of random initialization. I’ll share the results when I have them.
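To make the plan concrete, here is a rough sketch of the kind of script I have in mind. Everything here is a placeholder, not NuPIC code: a synthetic correlated 2D Gaussian cloud stands in for the dataset, and Euclidean vs. Mahalanobis distance from the centroid plays the role of the anomaly score whose contours would be plotted.

```python
# Sketch only: synthetic data and distance-based anomaly scores for contouring.
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "normal" class: a correlated 2D Gaussian cloud.
cov = np.array([[2.0, 1.2],
                [1.2, 1.0]])
data = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=500)

centroid = data.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(data, rowvar=False))

def euclidean_score(points):
    """Distance from the centroid; ignores the shape of the cluster."""
    return np.linalg.norm(points - centroid, axis=-1)

def mahalanobis_score(points):
    """Distance scaled by the cluster's sample covariance."""
    d = points - centroid
    return np.sqrt(np.einsum('...i,ij,...j->...', d, inv_cov, d))

# Score a grid of points; plotting these as contours would then be one call
# to matplotlib, e.g. plt.contour(xx, yy, scores).
xx, yy = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
grid = np.stack([xx, yy], axis=-1)
scores = mahalanobis_score(grid)

# Sanity check of the intuition: a point along the major axis of the cloud
# should look less anomalous under Mahalanobis than an equally distant
# off-axis point, while Euclidean treats them the same.
on_axis = mahalanobis_score(np.array([2.0, 1.2]))
off_axis = mahalanobis_score(np.array([-1.2, 2.0]))
print(scores.shape, on_axis < off_axis)
```

Running this for different settings (covariance, sample size, seed) is then just a matter of changing the constants at the top.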
best,
Nick

> On Oct 21, 2014, at 3:49 PM, Marek Otahal <[email protected]> wrote:
>
> Nick,
>
> > Here, we’re assuming that there’s a correlation between the distance
> > between 2 raw patterns and their corresponding columnar activations. I’m
> > not sure if this assumption holds for NuPIC’s SP implementation though.
>
> Yes, the ||SDR1,SDR2|| ~~ ||input1,input2|| relation holds. The
> implementation of closeness depends on the encoder and its settings.
>
> > ...and a set of columns produced by the training passes to act as a
> > template (analogous to the centroid of a cluster as mentioned above)...
>
> You could get this as a property of the columns, an "activation counter" or
> something; simply put, you can tell which columns have been more active
> than the others in the past.
>
> On Tue, Oct 21, 2014 at 2:33 PM, Nicholas Mitri <[email protected]
> <mailto:[email protected]>> wrote:
> Hey Mark,
>
> Just to clarify, when I reference anomaly detection outside the scope of
> NuPIC, I mean distance-based anomaly/novelty detectors. This includes
> Euclidean, normed Euclidean, scaled Euclidean, Manhattan, Mahalanobis, etc.
>
> The purpose of these simple detectors is to gain insight into how much you
> can expect the feature vector of a single class to vary, and then use that
> knowledge to decide whether a new feature pattern demonstrates "normal"
> variance from the average or is too far away and therefore anomalous. It’s
> all based on distance, and it is spatial anomaly detection at its simplest.
>
> The example you included is a multi-cluster scenario where you’d need to
> cluster, learn the proper centroid for each cluster, and then calculate
> anomaly scores for each cluster separately. It’s an extension of the
> single-cluster definition I gave.
>
>> anomaly detector = diff between active columns and columns with high
>> weights (commonly used)?
>
> It would be the difference between active columns and a set of columns
> produced by the training passes to act as a template (analogous to the
> centroid of a cluster as mentioned above). Here, we’re assuming that
> there’s a correlation between the distance between 2 raw patterns and their
> corresponding columnar activations. I’m not sure if this assumption holds
> for NuPIC’s SP implementation though.
>
> best,
> Nick
>
>> On Oct 21, 2014, at 2:58 PM, Marek Otahal <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> Hi Nick,
>>
>> thanks for the explanations.. some comments below.
>>
>> On Tue, Oct 21, 2014 at 1:13 PM, Nicholas Mitri <[email protected]
>> <mailto:[email protected]>> wrote:
>> This is not traditional spatial anomaly detection, where the purpose is to
>> decide if a new input pattern falls within the RANGE of previously
>> observed patterns.
>>
>> Hmm, I was unaware of such a spatial anomaly definition, so if I
>> understand it right: experienced: {1,2,3,101,102,103}, value 51 is normal,
>> while 152 is anomalous? (1000 being an anomaly is ok.)
>>
>> I somehow don’t like this definition (not sure why exactly now :)); maybe
>> a "distance from significant clusters in observed data" would be better
>> (?) (152 and 51 would have the same anomaly score, and e.g. 10 a low
>> score). ...but if it has its uses, why not.
>>
>> Here are a few excerpts from the wiki:
>>
>> "A non-temporal anomaly is defined as a combination of fields that doesn’t
>> usually occur, independent of the history of the data."
>>
>> Maybe we should update the wiki here; imho everything in CLA is dependent
>> on the history of the data (but not on the sequential order of the data,
>> in this case).
>>
>> This formulation will produce high anomaly scores for patterns that
>> haven’t been seen before even if they fall inside the cluster of older
>> patterns. Essentially, it’s detecting rarity and not spatial distance.
>> True, that is how CLA anomaly works now; maybe you could generate your
>> training samples from a uniform distribution (instead of just the edge
>> cases)?
>>
>> Scott’s suggestion of using overlap instead is spatial anomaly detection
>> in the traditional sense.
>>
>> I haven’t started testing out any code, but I’d be interested in seeing if
>> the SP can be used like a distance-based anomaly detector. Specifically, I
>> want to find out whether the spatial pattern stability can be used as an
>> analog for a cluster centroid and thus be compared to novel input to
>> calculate anomaly.
>>
>> I see. I think this is what you both said, so distance-based anomaly
>> detector = diff between active columns and columns with high weights
>> (commonly used)? We could turn this around and output distance anomaly as
>> the ratio of active columns with low weights.
>>
>> My main concern with that approach is that the anomaly detector will
>> produce a centroid and a threshold that is used to calculate an anomaly
>> score (think of a sigmoid function with the threshold as the knee). In the
>> SP, the only way to achieve that is to force stability for all training
>> patterns and bake in the thresholds accordingly for use with testing
>> patterns.
>>
>> Cheers,
>> Mark
>>
>> --
>> Marek Otahal :o)
>
>
> --
> Marek Otahal :o)
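To close the loop on Marek's "activation counter" idea and the "ratio of active columns with low weights" formulation: here is a toy sketch in plain NumPy. It is not NuPIC's SP; the column count, training-pass count, and the counting threshold (the "knee" discussed above) are all made-up illustration parameters. A template is built from per-column activation counters accumulated during training, and a novel activation is scored by the fraction of its active columns that fall outside that template.

```python
# Toy sketch of overlap/counter-based spatial anomaly scoring (not NuPIC code).
import numpy as np

n_columns = 64
rng = np.random.default_rng(0)

# Fake training history: SDR-like boolean activations where columns 0..19
# form the stable learned pattern, with a little random noise elsewhere.
stable = np.zeros(n_columns, dtype=bool)
stable[:20] = True

activation_counts = np.zeros(n_columns)
for _ in range(200):
    sdr = stable.copy()
    noise = rng.choice(n_columns, size=2, replace=False)
    sdr[noise] = ~sdr[noise]          # flip two random columns as noise
    activation_counts += sdr

# Template: columns active in more than half of the training passes.
# This threshold is the free parameter playing the role of the "knee".
template = activation_counts > 100

def anomaly_score(active_columns):
    """Fraction of currently active columns lying outside the template."""
    n_active = np.count_nonzero(active_columns)
    if n_active == 0:
        return 0.0
    return np.count_nonzero(active_columns & ~template) / n_active

familiar = stable                         # matches the trained pattern
novel = np.zeros(n_columns, dtype=bool)
novel[40:60] = True                       # mostly unseen columns

print(anomaly_score(familiar), anomaly_score(novel))
```

This gives 0 for the trained pattern and 1 for a pattern on columns the counters never favored, which is the "ratio of active columns with low weights" turned around, as suggested above. Whether NuPIC's actual SP preserves input distances well enough for this score to track spatial distance (rather than just rarity) is exactly the open question in this thread.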
