Hi Nicholas,

Ah, I misunderstood the experiment. (I thought you were comparing the SDR for x against the SDR for y.) Your experiment is a bit more sophisticated. What are your encoder parameters here?
Thanks,
--Subutai

On Wed, Oct 22, 2014 at 12:17 PM, Nicholas Mitri <[email protected]> wrote:

> Hi Subutai,
>
> Each feature is encoded separately, then the feature vector is
> concatenated from both and fed to the SP, so (3,2) and (2,3) end up
> being different patterns with their own response to the SP processing
> (thus the non-symmetric contours).
>
> Best,
> Nick
>
>
> On Oct 22, 2014, at 9:57 PM, Subutai Ahmad <[email protected]> wrote:
>
> Hi Nicholas,
>
> These are really great graphs, and a very nice way to do the analysis.
> Hopefully this can help improve our overall understanding of the SP,
> and help improve the algorithm implementations as well.
>
> In addition to experimenting with the settings, I realized that the
> way you train the SP also has a big impact (since it is a learning
> system). As a baseline, I suggest creating the graph with an
> essentially random SP, i.e. set the increments/decrements to 0. We
> should see "reasonable" curves for such an SP.
>
> One question: I don't understand why the 2D plots aren't symmetric.
> What is the exact meaning of the value of (2,3)? How do you compute
> the value of (2,3) vs. (3,2)?
>
> --Subutai
>
> On Wed, Oct 22, 2014 at 10:27 AM, Nicholas Mitri <[email protected]> wrote:
>
>> Thanks Subutai, I'll try the suggested settings.
>>
>> I prepared some supplementary figures to follow up on the discussion
>> below. For a 2D feature set, I plotted the contours that represent
>> the overlap with (0,0) as the reference point and grid points of the
>> 2D space as test samples. Here are 3 runs under the same settings,
>> followed by the contours for the Euclidean and Manhattan norms for
>> reference. You'll notice that the overlap creates contours that are
>> random and inconsistent in shape. The values at the contours still
>> decrease as you span out from the center, which is good (as shown by
>> the color gradient).
>> Unfortunately, it's very hard to make a case for spatial anomaly
>> detection with non-uniform contours.
>>
>> Sorry about the rough contours; matplotlib is acting up for finer
>> mesh grids.
>>
>> <contour1.png>
>> <contour2.png>
>> <contour3.png>
>> <eucContour.png>
>> <manContour.png>
>>
>>
>> On Oct 22, 2014, at 7:51 PM, Subutai Ahmad <[email protected]> wrote:
>>
>> Thanks.
>>
>> Yes, your reasoning about potentialPct = 1 seems right. I too think
>> that setting doesn't work in general. As you say, a small number of
>> columns can start to dominate and become fully connected.
>>
>> I have had the best luck with a number well above 0.5, though. With
>> 0.5, only about 25% will be initially connected. With w=21, that is
>> only about 5 or 6 bits. 5 or 6 makes me uncomfortable - it is
>> insufficient for reliable performance and can cause random changes
>> to have large effects. I would suggest trying something like 0.8 or
>> so.
>>
>> Also, in my experience I had better luck with a smaller learning
>> rate. I would suggest trying a synPermActiveInc of 0.001, and maybe
>> a synPermInactiveDec of about half of that.
>>
>> --Subutai
>>
>>
>> On Wed, Oct 22, 2014 at 9:01 AM, Nicholas Mitri <[email protected]> wrote:
>>
>>> I was trying different configurations just now. The best results
>>> are achieved with
>>>
>>> SP(self.inputShape,
>>>    self.columnDimensions,
>>>    potentialRadius=self.inputSize,
>>>    potentialPct=0.5,
>>>    numActiveColumnsPerInhArea=int(self.sparsity * self.columnNumber),
>>>    globalInhibition=True,
>>>    synPermActiveInc=0.05)
>>>
>>> potentialPct and synPermActiveInc have the most impact on the
>>> results. Specifically, potentialPct set to 1 has a very negative
>>> effect on the SP's behavior, as seen below. I suspect that setting
>>> this parameter to 1, and thus allowing all columns to "see" all
>>> inputs, levels the field of competition and causes the top 2% set
>>> to change drastically from one input to the next.
>>> A lower setting on that parameter allows a dominant and more stable
>>> set of columns to be established, which would explain why the
>>> overlap drops gradually.
>>>
>>> <fig6.png>
>>>
>>> On Oct 22, 2014, at 6:47 PM, Subutai Ahmad <[email protected]> wrote:
>>>
>>> Hi Nicholas,
>>>
>>> I think these are great tests to do. Can you share the SP
>>> parameters you used for this? What were the potentialPct, learning
>>> rates, and inhibition settings?
>>>
>>> Thanks,
>>>
>>> --Subutai
>>>
>>> On Wed, Oct 22, 2014 at 6:45 AM, Nicholas Mitri <[email protected]> wrote:
>>>
>>>> Hey Mark,
>>>>
>>>> To follow up on our discussion yesterday, I did a few tests on a
>>>> 1024-column SP with 128-bit (w = 21) RDSE input.
>>>> I fed the network inputs in the range [1-20] and calculated the
>>>> overlap of the output of the encoder, and of the output of the SP,
>>>> with the corresponding outputs for input = 1. The plots below show
>>>> 3 different runs under the same settings.
>>>>
>>>> The overlap at the encoder level is a straight line, as expected,
>>>> since the RDSE resolution is set to 1. The green plot shows the
>>>> overlap at the SP level.
>>>> Looking at these plots, my earlier statement that raw distance
>>>> does not necessarily translate to overlap appears to be true. The
>>>> good news is that it seems to be a rarity for the condition to
>>>> break! Specifically, notice that in the 3rd plot, input 13 has
>>>> more overlap with 1 than 12 does, thus breaking the condition.
>>>> Also, notice the effect of random initialization on the shape of
>>>> the green plot, which shows no consistent relation with the
>>>> encoder overlap.
>>>>
>>>> Taking all this into consideration: since the assumption seems to
>>>> hold in most cases and the SP overlap is non-increasing, I think
>>>> we can leverage the overlap for spatial anomaly detection as
>>>> discussed earlier, but I see little promise of it performing well
>>>> given the inconsistency of the overlap metric.
>>>>
>>>> <fig3.png>
>>>> <fig4.png>
>>>> <fig5.png>
>>>>
>>>> On Oct 21, 2014, at 6:13 PM, Marek Otahal <[email protected]> wrote:
>>>>
>>>> Hello Nick,
>>>>
>>>> On Tue, Oct 21, 2014 at 3:51 PM, Nicholas Mitri <[email protected]> wrote:
>>>>>
>>>>> The relation extends past that; it's ||input1,input2|| ->
>>>>> ||encSDR1,encSDR2|| -> ||colSDR1,colSDR2||.
>>>>>
>>>>> The assumption holds perfectly for the first transition. For the
>>>>> second, things get a bit muddy because ...
>>>>
>>>> Actually, I meant the 2nd transition, ||inputs|| -> ||colSDR||, as
>>>> encoders do not produce SDRs but just a vector representation of
>>>> the input.
>>>>
>>>>> ... of the *random partial* connectivity of columns. If for 2
>>>>> different patterns the set of active columns only changes by 1
>>>>> column (e.g. {1,2,4,7} for pattern1 and {1,2,3,7} for pattern2),
>>>>> there's no way for you to know if that difference was caused by a
>>>>> change of 2 resolution steps or 5 resolution steps at the level
>>>>> of the raw input. So, distances in raw feature space don't
>>>>> translate to the high-dimensional binary space of columnar
>>>>> activations.
>>>>
>>>> If the change of 2 vs. 5 resolution steps in the raw input is
>>>> significant (given the encoder settings, the SP's available
>>>> resources (#cols) vs. the diversity of patterns, ...), you may not
>>>> be able to tell from ||sdr1,sdr2|| exactly what the ||.|| in raw
>>>> inputs was, but the correlation (linear?)
>>>> ||input,input|| < ||input,input2 (diff 2 res. steps)|| < ||input,input5||
>>>> -->
>>>> ||sdr,sdr|| < ||sdr,sdr2|| < ||sdr,sdr5||
>>>> should hold; thus you could find a threshold for anomaly/normal.
>>>>
>>>>> If you've seen the thesis chapter I posted here some time ago
>>>>> about using the SP for clustering, you can see what I describe
>>>>> here happening for different runs of the clustering code. I'd get
>>>>> different clusters each time, with no predictable cluster shapes.
>>>>> I'll attach the clustering results here for your reference.
>>>>
>>>> Not sure; I'll search for it and take a look!
>>>>
>>>>> If you extend the conclusions of the clustering experiments to
>>>>> spatial anomaly detection, I think it's fair to assume that there
>>>>> is no way to use columnar activation to compute a continuous
>>>>> anomaly score in the same way you can for distance-based anomaly
>>>>> detectors.
>>>>
>>>> Would this imply "anomaly detection in nupic does not work"?
>>>> Because if we assume it's impossible to get an anomaly score from
>>>> a lower layer - the SP - can we do so in the TP, which takes the
>>>> former as input?
>>>>
>>>> > I'll attach the clustering results here for your reference.
>>>>
>>>> I like the visualization! :) What is the problem with it? The
>>>> clusters are more or less the same shape and the same (spatial)
>>>> distribution. Is the problem that there's essentially no overlap?
>>>> So you can't say the Red and Green clusters are closer than the
>>>> Red and Blue ones, in the way 1~3 is closer than 1~5? I think it's
>>>> because the SP is not saturated enough (for its size vs. the small
>>>> input range).
>>>>
>>>> This might be hard to visualize: since you need enough columns for
>>>> the SP to work well (the default ~2048), it has "too much"
>>>> capacity, so the clusters are totally distinct. It would be
>>>> interesting if you visualized the same SP trained on 2000 numbers
>>>> (maybe more/less?).
>>>>
>>>> Regards, Mark
>>>>
>>>> --
>>>> Marek Otahal :o)
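The encoder-level overlap curve Nicholas describes (a straight line when the RDSE resolution is 1) can be sketched with a toy contiguous scalar encoder. Note the assumption: this is not nupic's RDSE, which scatters its active bits pseudo-randomly; but for nearby scalars both encoders lose shared active bits at the same linear rate, which is what produces the straight-line overlap. The names `encode` and `overlap` and the parameter values are illustrative.

```python
# Toy sketch of the encoder-level overlap curve from the thread.
# ASSUMPTION: a plain contiguous scalar encoder stands in for nupic's
# RDSE; the linear fall-off of overlap with distance is the same.

W = 21          # active bits per encoding (w = 21 in the thread)
RESOLUTION = 1  # inputs one resolution step apart share W - 1 bits

def encode(value):
    """Return the set of active bit indices for a scalar value."""
    start = int(value / RESOLUTION)
    return set(range(start, start + W))

def overlap(a, b):
    """Number of active bits two encodings have in common."""
    return len(a & b)

reference = encode(1)
curve = [overlap(reference, encode(x)) for x in range(1, 21)]
print(curve)  # drops by exactly 1 per resolution step: 21, 20, ..., 2
```

This reproduces the straight line at the encoder level; the thread's point is that the SP output does not preserve this linearity run-to-run.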

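The overlap-threshold idea Marek raises (||sdr,sdr|| < ||sdr,sdr2|| < ||sdr,sdr5||, so a threshold could separate normal from anomalous) could be scored roughly as below. This is an illustrative sketch only, not nupic's anomaly implementation; `spatial_anomaly_score` and the stored-SDR scheme are assumptions introduced here.

```python
# Illustrative overlap-based spatial anomaly score (an assumption, not
# nupic's implementation): 0.0 when the new SP output exactly matches a
# stored "normal" SDR, rising toward 1.0 as the best overlap falls off.

def overlap(a, b):
    """Number of active columns two SDRs share."""
    return len(a & b)

def spatial_anomaly_score(sdr, normal_sdrs, n_active):
    """1 minus the best overlap with any known-normal SDR, normalized
    by the number of active columns (e.g. the top-2% set size)."""
    if not normal_sdrs:
        return 1.0
    best = max(overlap(sdr, ref) for ref in normal_sdrs)
    return 1.0 - best / n_active

# 40 active columns as a stand-in for ~2% sparsity on 2048 columns.
normal = [set(range(0, 40)), set(range(100, 140))]
print(spatial_anomaly_score(set(range(0, 40)), normal, 40))      # 0.0
print(spatial_anomaly_score(set(range(20, 60)), normal, 40))     # 0.5
print(spatial_anomaly_score(set(range(1000, 1040)), normal, 40)) # 1.0
```

The contour experiments in the thread show why such a score may be poorly calibrated: the same overlap value can correspond to very different raw-input distances, which is exactly Nicholas's concern about non-uniform contours.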