Re: Anomaly Detection Files

Marek Otahal Tue, 21 Oct 2014 08:15:00 -0700

Hello Nick,


On Tue, Oct 21, 2014 at 3:51 PM, Nicholas Mitri <[email protected]> wrote:
>
> The relation extends past that, it’s ||input1,input2|| ->
> ||encSDR1,encSDR2|| -> ||colSDR1,colSDR2||.
>
> The assumption holds perfectly for the first transition. For the second,
> things get a bit muddy because ...
>

Actually, I meant the second transition, ||inputs|| -> ||colSDR||, as encoders
do not produce SDRs, just a vector representation of the input.


> of the *random partial* connectivity of columns. If for 2 different
> patterns the set of active columns only changes by 1 column (e.g. {1,2,4,7}
> for pattern1 and {1,2,3,7} for pattern2), there’s no way for you to know if
> that difference was caused by a change of 2 resolution steps or 5
> resolution steps at the level of the raw input. So, distances in raw
> feature space don’t translate to the high dimensional binary space of
> columnar activations.
>

If the change of 2 vs. 5 resolution steps in the raw input is significant
(according to the encoder settings, the SP's available resources (#cols) vs.
the diversity of patterns, ...), you may not be able to tell from
||sdr1,sdr2|| exactly what the ||.|| in the raw inputs was, but the ordering
(a linear correlation?) should be preserved:
||input,input|| < ||input,input2 (diff. 2 res. steps)|| < ||input,input5||
-->
||sdr,sdr|| < ||sdr,sdr2|| < ||sdr,sdr5||
Thus you could find a threshold separating anomalous from normal.
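
A minimal sketch of the thresholding idea, assuming the distance ||.|| between
two SDRs is taken as the number of columns active in exactly one of them.
pattern1 and pattern2 are the active-column sets from the quoted example;
pattern5, the helper names, and the threshold value are hypothetical and only
illustrate the ordering, they are not NuPIC API.

```python
def sdr_distance(a, b):
    """Count the columns that are active in exactly one of the two SDRs."""
    return len(set(a) ^ set(b))


def exceeds_threshold(candidate, reference, threshold):
    """Return True when candidate is farther from reference than threshold."""
    return sdr_distance(candidate, reference) > threshold


# Active-column sets taken from the quoted example.
pattern1 = {1, 2, 4, 7}
pattern2 = {1, 2, 3, 7}
# Hypothetical SDR for an input several resolution steps away (assumption).
pattern5 = {1, 3, 5, 8}

d_same = sdr_distance(pattern1, pattern1)
d_near = sdr_distance(pattern1, pattern2)
d_far = sdr_distance(pattern1, pattern5)

# Hypothetical threshold chosen to sit between the near and far distances.
THRESHOLD = 4
print(d_same, d_near, d_far)
print(exceeds_threshold(pattern2, pattern1, THRESHOLD))
print(exceeds_threshold(pattern5, pattern1, THRESHOLD))
```

Whether a real SP preserves this ordering for a given data set is exactly the
open question in this thread; the sketch only shows how a threshold would be
applied if it does.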


>
> If you’ve seen the thesis chapter I posted here some time ago about using
> the SP for clustering, you can see what I describe here happening for
> different runs of the clustering code. I’d get different clusters each time
> with no predictable cluster shapes. I’ll attach the clustering results here
> for your reference.
>

Not sure, I'll search for it and take a look!

>
> If you extend the conclusions of the clustering experiments to spatial
> anomaly detection, I think it’s fair to assume that there is no way to use
> columnar activation to compute a continuous anomaly score in the same way
> you can for distance based anomaly detectors.
>
Would this imply that "anomaly in nupic does not work"? If we assume it's
impossible to get an anomaly score from the lower layer (the SP), can we get
one from the TP, which takes the SP's output as its input?
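
For context, a minimal sketch of a temporal (TP-based) anomaly score, assuming
the commonly described definition: the fraction of currently active columns
that were not predicted at the previous step. The function name and the example
sets are illustrative assumptions, not a call into NuPIC itself.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were not predicted (0.0 to 1.0)."""
    active = set(active_columns)
    if not active:
        # Nothing is active, so nothing can be unexpected.
        return 0.0
    predicted = set(predicted_columns)
    return 1.0 - float(len(active & predicted)) / len(active)


# Three of the four active columns were predicted.
print(raw_anomaly_score({1, 2, 3, 7}, {1, 2, 4, 7}))
```

Note that this score compares set membership against a prediction, so it does
not rely on distances between column SDRs reflecting distances in the raw
input.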

> I’ll attach the clustering results here for your reference.

I like the visualization! :) What is the problem with it? The clusters have
more or less the same shape and the same (spatial) distribution. Is the
problem that there's essentially no overlap, so you can't say the Red and
Green clusters are closer than Red and Blue, the way 1~3 is closer than 1~5?
I think that's because the SP is not saturated enough (for its size vs. the
small input range).

This might be hard to visualize: you need enough columns for the SP to work
well (the default ~2048), but then it has "too much" capacity, so the clusters
are totally distinct. It would be interesting to visualize the same SP trained
on 2000 numbers (maybe more or fewer).

regards, Mark

-- 
Marek Otahal :o)
