Thanks.

Yes, your reasoning about potentialPct = 1 seems right. I too think that
setting doesn't work in general. As you say, a small number of columns can
start to dominate and become fully connected.

I have had the best luck with a number well above 0.5, though. With 0.5, only
about 25% of the input bits will be initially connected. With w=21, that is
only about 5 or 6 bits. Five or six makes me uncomfortable: it is insufficient
for reliable performance and can cause random changes to have large effects. I
would suggest trying something like 0.8 or so.
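The arithmetic above can be sketched directly. This is a back-of-envelope estimate, not a NuPIC API call; it assumes (as the text implies) that roughly half of a column's potential synapses start with permanence above the connected threshold, and the helper name is mine:

```python
# Back-of-envelope estimate (an assumption about SP initialization, not a
# NuPIC function): with potentialPct = p, a column's potential pool covers
# a fraction p of the input, and roughly half of those potential synapses
# start connected.
def expected_connected_bits(w, potential_pct, init_connected_frac=0.5):
    """Expected number of the w active input bits initially connected."""
    return w * potential_pct * init_connected_frac

print(expected_connected_bits(21, 0.5))  # ~5.25 bits, the "5 or 6" above
print(expected_connected_bits(21, 0.8))  # ~8.4 bits with the suggested 0.8
```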

Also, in my experience I have had better luck with a smaller learning rate. I
would suggest trying a synPermActiveInc of 0.001 and a synPermInactiveDec of
about half that.
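Put together, the suggestions from this message might look like the sketch below. The keyword names mirror the SP constructor call quoted later in the thread; this is an illustrative sketch, not a verified NuPIC configuration:

```python
# Hypothetical parameter set reflecting the suggestions in this message.
# Names mirror the SP constructor keywords quoted later in the thread;
# values not discussed above are illustrative only.
sp_params = dict(
    potentialPct=0.8,           # well above 0.5, so enough input bits connect
    synPermActiveInc=0.001,     # smaller learning rate
    synPermInactiveDec=0.0005,  # about half of the active increment
    globalInhibition=True,
)
# The decrement tracks the increment at roughly half its value:
assert sp_params["synPermInactiveDec"] == sp_params["synPermActiveInc"] / 2
```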

--Subutai


On Wed, Oct 22, 2014 at 9:01 AM, Nicholas Mitri <[email protected]> wrote:

> I was trying different configurations just now. The best results are
> achieved with
>
> SP(self.inputShape,
>    self.columnDimensions,
>    potentialRadius=self.inputSize,
>    potentialPct=0.5,
>    numActiveColumnsPerInhArea=int(self.sparsity * self.columnNumber),
>    globalInhibition=True,
>    synPermActiveInc=0.05)
>
> potentialPct and synPermActiveInc have the most impact on the results.
> Specifically, setting potentialPct to 1 has a very negative effect on the SP’s
> behavior, as seen below. I suspect that setting this parameter to 1, thus
> allowing all columns to “see” all inputs, levels the field of competition
> and causes the top 2% set to change drastically from one input to the next.
> A lower setting on that parameter allows a dominant and more stable set of
> columns to be established, which would explain why the overlap drops
> gradually.
>
> On Oct 22, 2014, at 6:47 PM, Subutai Ahmad <[email protected]> wrote:
>
> Hi Nicholas,
>
> I think these are great tests to do. Can you share the SP parameters you
> used for this? What were the potential pct, learning rates, and inhibition
> settings?
>
> Thanks,
>
> --Subutai
>
> On Wed, Oct 22, 2014 at 6:45 AM, Nicholas Mitri <[email protected]>
> wrote:
>
>> Hey Mark,
>>
>> To follow up on our discussion yesterday, I ran a few tests on a
>> 1024-column SP with a 128-bit (w = 21) RDSE input.
>> I fed the network inputs in the range [1-20] and calculated the overlap
>> of the encoder output and the SP output with the corresponding outputs
>> for input = 1. The plots below show 3 different runs under the same
>> settings.
>>
>> The overlap at the encoder level is a straight line, as expected, since the
>> RDSE resolution is set to 1. The green plot shows the overlap at the SP
>> level.
>> Looking at these plots, my earlier claim appears to hold: raw input
>> distance does not reliably translate to overlap. The good news is that it
>> seems rare for the condition to break. Specifically, notice that in the
>> 3rd plot, input 13 has more overlap with 1 than input 12 does, breaking the
>> condition. Also, notice the effect of random initialization on the shape of
>> the green plot, which shows no consistent relation to the encoder overlap.
>>
>> Taking all of this into consideration, since the assumption seems to hold
>> in most cases and the SP overlap is non-increasing, I think we can leverage
>> the overlap for spatial anomaly detection as discussed earlier, though I
>> see little promise of it performing well given the inconsistency of the
>> overlap metric.
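The overlap measurement described above can be sketched minimally, representing each output as a set of active bit indices and comparing everything against the output for input = 1 (the example SDRs here are made up for illustration, not taken from the actual runs):

```python
# Sketch of the comparison described above: overlap of each output with
# the output for input = 1. SDRs are sets of active bit indices; the
# example values are made up.
def overlap(a, b):
    return len(a & b)

sdrs = {1: {0, 3, 5, 9}, 2: {0, 3, 5, 8}, 3: {0, 3, 7, 8}}
reference = sdrs[1]
overlaps = {i: overlap(reference, sdr) for i, sdr in sdrs.items()}
# overlaps == {1: 4, 2: 3, 3: 2}: overlap falls off with input distance
```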
>>
>> <fig3.png>
>> <fig4.png>
>> <fig5.png>
>>
>> On Oct 21, 2014, at 6:13 PM, Marek Otahal <[email protected]> wrote:
>>
>> Hello Nick,
>>
>>
>> On Tue, Oct 21, 2014 at 3:51 PM, Nicholas Mitri <[email protected]>
>> wrote:
>>>
>>> The relation extends past that, it’s ||input1,input2|| ->
>>> ||encSDR1,encSDR2|| -> ||colSDR1,colSDR2||.
>>>
>>> The assumption holds perfectly for the first transition. For the second,
>>> things get a bit muddy because ...
>>>
>>
>> Actually, I meant the 2nd transition ||inputs|| -> ||colSDR||, as encoders
>> do not produce SDRs but just a vector representation of the input.
>>
>>
>>> of the *random partial* connectivity of columns. If for 2 different
>>> patterns the set of active columns only changes by 1 column (e.g. {1,2,4,7}
>>> for pattern1 and {1,2,3,7} for pattern2), there’s no way for you to know if
>>> that difference was caused by a change of 2 resolution steps or 5
>>> resolution steps at the level of the raw input. So, distances in raw
>>> feature space don’t translate to the high dimensional binary space of
>>> columnar activations.
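The example above can be made concrete; a minimal sketch using the column sets from the text (the point being that the same one-column change is compatible with either raw distance):

```python
# The active-column sets from the example: one column differs, but that
# single-column change could correspond to either 2 or 5 resolution steps
# in the raw input, so the raw distance is not recoverable from the SDRs.
pattern1 = {1, 2, 4, 7}
pattern2 = {1, 2, 3, 7}
changed = len(pattern1 ^ pattern2) // 2  # columns swapped: 1
shared = len(pattern1 & pattern2)        # columns kept: 3
```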
>>>
>>
>> If the change of 2 vs. 5 resolution steps in the raw input is significant
>> (given the encoder settings, the SP's available resources (#cols) vs. the
>> diversity of patterns, ...), you may not be able to tell from ||sdr1, sdr2||
>> exactly what the ||.|| between the raw inputs was. But if the ordering
>> ||input, input|| < ||input, input2 (diff 2 res. steps)|| < ||input, input5||
>> carries over (linearly?) to
>> ||sdr, sdr|| < ||sdr, sdr2|| < ||sdr, sdr5||,
>> then you could find a threshold separating anomaly from normal.
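The thresholding idea could be sketched as follows; the threshold value and SDRs are illustrative assumptions, and this only works if the ordering above generally holds:

```python
# If larger raw distances generally mean smaller SDR overlap, a fixed
# overlap threshold can separate normal from anomalous inputs. SDRs are
# sets of active column indices; the threshold here is made up.
def is_anomalous(reference, sdr, min_overlap=3):
    """Flag an SDR whose overlap with the reference falls below threshold."""
    return len(reference & sdr) < min_overlap

ref = {1, 2, 4, 7, 9}
print(is_anomalous(ref, {1, 2, 4, 7, 8}))   # False: overlap is 4
print(is_anomalous(ref, {10, 11, 12, 13}))  # True: overlap is 0
```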
>>
>>
>>>
>>> If you’ve seen the thesis chapter I posted here some time ago about
>>> using the SP for clustering, you can see what I describe here happening
>>> across different runs of the clustering code. I’d get different clusters
>>> each time, with no predictable cluster shapes. I’ll attach the clustering
>>> results here for your reference.
>>>
>>
>> Not sure, I'll search for it and take a look!
>>
>>>
>>> If you extend the conclusions of the clustering experiments to spatial
>>> anomaly detection, I think it’s fair to assume that there is no way to use
>>> columnar activation to compute a continuous anomaly score in the same way
>>> you can for distance based anomaly detectors.
>>>
>> Would this imply that "anomaly detection in NuPIC does not work"? Because
>> if we assume it's impossible to get an anomaly score from the lower layer
>> (the SP), can we still do it in the TP, which takes the SP's output as
>> input?
>>
>> > I’ll attach the clustering results here for your reference.
>>
>> I like the visualization! :) What is the problem with it? The clusters
>> are more or less the same shape and have the same (spatial) distribution.
>> Is the problem that there's essentially no overlap, so you can't say that
>> the Red and Green clusters are closer than Red and Blue, the way 1~3 is
>> closer than 1~5? I think that's because the SP is not saturated enough
>> (for its size vs. the small input range).
>>
>> This might be hard to visualize: you need enough columns for the SP to
>> work well (the default ~2048), so it has "too much" capacity and the
>> clusters come out totally distinct. It would be interesting if you
>> visualized the same SP trained on 2000 numbers (maybe more or less?).
>>
>> regards, Mark
>>
>> --
>> Marek Otahal :o)
>>
>>
>>
>
>
