Not to interrupt, but remember we determined that that setting
(potentialPct) won't work until the fix is merged?

On Wed, Oct 22, 2014 at 11:51 AM, Subutai Ahmad <[email protected]> wrote:

>
> Thanks.
>
> Yes, your reasoning about potentialPct = 1 seems right. I too think that
> setting doesn't work in general. As you say, a small number of columns can
> start to dominate and become fully connected.
>
> I have had best luck with a number well above 0.5 though. With 0.5, only
> about 25% will be initially connected. With w=21, that is only about 5 or 6
> bits. 5 or 6 makes me uncomfortable - it is insufficient for reliable
> performance and can cause random changes to have large effects. I would
> suggest trying something like 0.8 or so.
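As a back-of-envelope check on those numbers (assuming, hypothetically, that about half of the potential synapses are initialized above the connected permanence threshold):

```python
# Expected number of initially connected input bits per column.
# The 50% initial-connection fraction is an assumption for illustration.
def expected_connected_bits(w, potential_pct, init_connected_frac=0.5):
    return w * potential_pct * init_connected_frac

w = 21  # active bits in the encoder output
print(expected_connected_bits(w, 0.5))  # roughly 5, matching "5 or 6 bits"
print(expected_connected_bits(w, 0.8))  # roughly 8, a safer margin
```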
>
> Also, in my experience I had better luck with a smaller learning rate.  I
> would suggest trying synPermActiveInc of 0.001 and maybe synPermInactiveDec
> about half of that.
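Put together, the suggested settings would look roughly like this (parameter names follow the NuPIC SpatialPooler constructor; the exact values are just the suggestions above, not tested defaults):

```python
# Hypothetical parameter dict reflecting the suggestions in this email.
sp_params = dict(
    potentialPct=0.8,           # well above 0.5
    synPermActiveInc=0.001,     # smaller learning rate
    synPermInactiveDec=0.0005,  # about half of the increment
)
print(sp_params)
```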
>
> --Subutai
>
>
> On Wed, Oct 22, 2014 at 9:01 AM, Nicholas Mitri <[email protected]>
> wrote:
>
>> I was trying different configurations just now. The best results are
>> achieved with
>>
>> SP(self.inputShape,
>>    self.columnDimensions,
>>    potentialRadius=self.inputSize,
>>    potentialPct=0.5,
>>    numActiveColumnsPerInhArea=int(self.sparsity * self.columnNumber),
>>    globalInhibition=True,
>>    synPermActiveInc=0.05)
>>
>> potentialPct and synPermActiveInc have the most impact on the results.
>> Specifically, potentialPct set to 1 has a very negative effect on the SP’s
>> behavior, as seen below. I suspect that setting this parameter to 1, and
>> thus allowing all columns to “see” all inputs, levels the field of
>> competition and causes the top 2% set to change drastically from one input
>> to the next. A lower setting allows a dominant and more stable set of
>> columns to be established, which would explain why the overlap drops
>> gradually.
>>
>> On Oct 22, 2014, at 6:47 PM, Subutai Ahmad <[email protected]> wrote:
>>
>> Hi Nicholas,
>>
>> I think these are great tests to do. Can you share the SP parameters you
>> used for this? What was the potential pct, learning rates, and inhibition?
>>
>> Thanks,
>>
>> --Subutai
>>
>> On Wed, Oct 22, 2014 at 6:45 AM, Nicholas Mitri <[email protected]>
>> wrote:
>>
>>> Hey Mark,
>>>
>>> To follow up on our discussion yesterday, I did a few tests on a
>>> 1024-column SP with 128-bit long (w = 21) RDSE input.
>>> I fed the network inputs in the range [1-20] and calculated the overlap
>>> of the output of the encoder and the output of the SP with the
>>> corresponding outputs for input = 1. The plots below show 3 different runs
>>> under the same settings.
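The comparison loop amounts to something like the sketch below; the shifting-window encoder is only a stand-in for the RDSE (resolution 1, w = 21), since the point here is just the overlap bookkeeping:

```python
W = 21  # active bits per encoding

def encode(x):
    # Stand-in encoder: each resolution step shifts the active window by
    # one bit. The real RDSE places bits randomly, but overlap decays
    # similarly within W steps of the reference input.
    return set(range(x, x + W))

baseline = encode(1)
overlaps = [len(encode(i) & baseline) for i in range(1, 21)]
print(overlaps)  # a straight line: [21, 20, ..., 2]
```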
>>>
>>> The overlap at the encoder level is a straight line as expected since
>>> the RDSE resolution is set to 1. The green plot compares the overlap at the
>>> SP level.
>>> Looking at these plots, it appears my claim holds: raw distance in the
>>> input does not reliably translate to overlap at the SP level. The good
>>> news is that the condition breaks only rarely. Specifically, notice that
>>> in the 3rd plot, input 13 has more overlap with input 1 than input 12
>>> does, breaking the condition. Also, notice the effect of random
>>> initialization on the shape of the green plot, which shows no consistent
>>> relation with the encoder overlap.
>>>
>>> Taking all this into consideration, since the assumption holds in most
>>> cases and the SP overlap is non-increasing, I think we can leverage the
>>> overlap for spatial anomaly detection as discussed earlier, but given
>>> the inconsistency of the overlap metric I see little promise of it
>>> performing well.
>>>
>>> <fig3.png>
>>> <fig4.png>
>>> <fig5.png>
>>>
>>> On Oct 21, 2014, at 6:13 PM, Marek Otahal <[email protected]> wrote:
>>>
>>> Hello Nick,
>>>
>>>
>>> On Tue, Oct 21, 2014 at 3:51 PM, Nicholas Mitri <[email protected]>
>>> wrote:
>>>>
>>>> The relation extends past that: it’s ||input1,input2|| ->
>>>> ||encSDR1,encSDR2|| -> ||colSDR1,colSDR2||.
>>>>
>>>> The assumption holds perfectly for the first transition. For the
>>>> second, things get a bit muddy because ...
>>>>
>>>
>>> Actually, I meant the 2nd transition, ||inputs|| -> ||colSDR||, as
>>> encoders do not produce SDRs but just a vector representation of the input.
>>>
>>>
>>>> of the *random partial* connectivity of columns. If for 2 different
>>>> patterns the set of active columns only changes by 1 column (e.g. {1,2,4,7}
>>>> for pattern1 and {1,2,3,7} for pattern2), there’s no way for you to know if
>>>> that difference was caused by a change of 2 resolution steps or 5
>>>> resolution steps at the level of the raw input. So, distances in raw
>>>> feature space don’t translate to the high dimensional binary space of
>>>> columnar activations.
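A toy illustration of that point, with made-up column sets (not real SP output):

```python
def overlap(a, b):
    # Number of active columns shared by two activation sets.
    return len(set(a) & set(b))

pattern1_cols = {1, 2, 4, 7}
pattern2_cols = {1, 2, 3, 7}  # one column swapped

# Whether the raw inputs differed by 2 or by 5 resolution steps, the
# observed overlap is the same, so the raw distance is not recoverable.
print(overlap(pattern1_cols, pattern2_cols))  # 3
```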
>>>>
>>>
>>> If the change of 2 vs. 5 resolution steps in the raw input is significant
>>> (given the encoder settings, the SP's available resources (#cols) vs. the
>>> diversity of patterns, ...), you may not be able to tell from
>>> ||sdr1,sdr2|| exactly what the ||.|| in the raw inputs was, but the
>>> correlation (linear?) in
>>> ||input,input|| < ||input,input2(diff 2 res. steps)|| < ||input,input5||
>>> -->
>>> ||sdr,sdr|| < ||sdr,sdr2|| < ||sdr,sdr5||
>>> should still hold; thus you could find a threshold for anomaly/normal.
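That thresholding idea could be sketched as follows; the SDRs, the score formula, and the threshold value are all illustrative assumptions:

```python
def anomaly_score(current_sdr, reference_sdr, w=21):
    # 0.0 = identical to the reference, 1.0 = no shared bits.
    shared = len(set(current_sdr) & set(reference_sdr))
    return 1.0 - shared / w

reference = set(range(21))      # SDR for the "normal" input
nearby    = set(range(2, 23))   # ~2 resolution steps away: high overlap
far_away  = set(range(15, 36))  # ~5+ steps away: low overlap

print(anomaly_score(nearby, reference))    # small score: still "normal"
print(anomaly_score(far_away, reference))  # large score
THRESHOLD = 0.5  # hypothetical cutoff separating normal from anomalous
print(anomaly_score(far_away, reference) > THRESHOLD)  # True
```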
>>>
>>>
>>>>
>>>> If you’ve seen the thesis chapter I posted here some time ago about
>>>> using the SP for clustering, you can see what I describe here happening for
>>>> different runs of the clustering code. I’d get different clusters each time
>>>> with no predictable cluster shapes. I'll attach the clustering results here
>>>> for your reference.
>>>>
>>>
>>> Not sure, I'll search for it and take a look!
>>>
>>>>
>>>> If you extend the conclusions of the clustering experiments to spatial
>>>> anomaly detection, I think it’s fair to assume that there is no way to use
>>>> columnar activation to compute a continuous anomaly score in the same way
>>>> you can for distance based anomaly detectors.
>>>>
>>> Would this imply "anomaly detection in NuPIC does not work"? If we assume
>>> it's impossible to get an anomaly score from the lower layer (the SP),
>>> can we still do it in the TP, which takes the SP output as input?
>>>
>>> > I'll attach the clustering results here for your reference.
>>>
>>> I like the visualization! :) What is the problem with it? The clusters
>>> are more or less the same shape and the same (spatial) distribution. Is
>>> the problem that there's essentially no overlap? So you can't say the
>>> Red and Green clusters are closer than Red and Blue, the way 1~3 is
>>> closer than 1~5? I think it's because the SP is not saturated enough
>>> (for its size vs. the small input range).
>>>
>>> This might be hard to visualize: you need enough columns for the SP to
>>> work well (the default ~2048), so it has "too much" capacity and the
>>> clusters come out totally distinct. It would be interesting to visualize
>>> the same SP trained on 2000 numbers (maybe more/less?).
>>>
>>> regards, Mark
>>>
>>> --
>>> Marek Otahal :o)
>>>
>>>
>>>
>>
>>
>


-- 
*We find it hard to hear what another is saying because of how loudly "who
one is", speaks...*
