Thanks. Yes, your reasoning about potentialPct = 1 seems right. I too think that setting doesn't work in general. As you say, a small number of columns can start to dominate and become fully connected.
I have had best luck with a number well above 0.5, though. With 0.5, only about 25% will be initially connected. With w=21, that is only about 5 or 6 bits. 5 or 6 makes me uncomfortable: it is insufficient for reliable performance and can cause random changes to have large effects. I would suggest trying something like 0.8 or so. Also, in my experience I had better luck with a smaller learning rate. I would suggest trying a synPermActiveInc of 0.001 and maybe a synPermInactiveDec of about half that.

--Subutai

On Wed, Oct 22, 2014 at 9:01 AM, Nicholas Mitri <[email protected]> wrote:

> I was trying different configurations just now. The best results are
> achieved with
>
>     SP(self.inputShape,
>        self.columnDimensions,
>        potentialRadius=self.inputSize,
>        potentialPct=0.5,
>        numActiveColumnsPerInhArea=int(self.sparsity * self.columnNumber),
>        globalInhibition=True,
>        synPermActiveInc=0.05)
>
> potentialPct and synPermActiveInc have the most impact on the results.
> Specifically, potentialPct set to 1 has a very negative effect on the SP’s
> behavior, as seen below. I suspect that setting this parameter to 1, and
> thus allowing all columns to “see” all inputs, levels the field of
> competition and causes the top 2% set to change drastically from one input
> to the next. A lower setting on that parameter allows a dominant and more
> stable set of columns to be established, which would explain why the
> overlap drops gradually.
>
> On Oct 22, 2014, at 6:47 PM, Subutai Ahmad <[email protected]> wrote:
>
> Hi Nicholas,
>
> I think these are great tests to do. Can you share the SP parameters you
> used for this? What were the potential pct, learning rates, and inhibition?
>
> Thanks,
>
> --Subutai
>
> On Wed, Oct 22, 2014 at 6:45 AM, Nicholas Mitri <[email protected]> wrote:
>
>> Hey Mark,
>>
>> To follow up on our discussion yesterday, I did a few tests on a
>> 1024-column SP with a 128-bit-long (w = 21) RDSE input.
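The "5 or 6 bits" arithmetic above can be sketched as follows (the helper name is hypothetical; the ~0.5 initially-connected fraction of the potential pool is the thread's own estimate, not something measured here):

```python
def expected_connected_on_bits(potential_pct, connected_frac, w):
    """Expected number of a column's connected synapses that land on the
    w active bits of the encoder output: each active bit falls in the
    column's potential pool with probability potential_pct, and each
    potential synapse starts out connected with probability connected_frac.
    """
    return potential_pct * connected_frac * w

# potentialPct = 0.5 with roughly half the potential synapses initially
# connected, and w = 21 active encoder bits:
print(expected_connected_on_bits(0.5, 0.5, 21))  # 5.25, i.e. "5 or 6 bits"

# The suggested potentialPct = 0.8 raises the expected margin to ~8.4:
print(expected_connected_on_bits(0.8, 0.5, 21))
```

This is only an expectation over the random initialization; individual columns will scatter around it, which is exactly why 5 or 6 expected bits leaves so little headroom.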
>> I fed the network inputs in the range [1-20] and calculated the overlap
>> of the output of the encoder and the output of the SP with the
>> corresponding outputs for input = 1. The plots below show 3 different
>> runs under the same settings.
>>
>> The overlap at the encoder level is a straight line, as expected, since
>> the RDSE resolution is set to 1. The green plot shows the overlap at the
>> SP level. Looking at these plots, it appears my statement about raw
>> distance not translating to overlap is true. The good news is that it
>> seems to be a rarity for the condition to break! Specifically, notice
>> that in the 3rd plot, input 13 has more overlap with 1 than 12 does,
>> thus breaking the condition. Also, notice the effect of random
>> initialization on the shape of the green plot, which shows no consistent
>> relation with the encoder overlap.
>>
>> Taking all this into consideration, since the assumption seems to hold
>> in most cases and the SP overlap is non-increasing, I think we can
>> leverage the overlap for spatial anomaly detection as discussed earlier,
>> but I see little promise of it performing well given the inconsistency
>> of the overlap metric.
>>
>> <fig3.png>
>> <fig4.png>
>> <fig5.png>
>>
>> On Oct 21, 2014, at 6:13 PM, Marek Otahal <[email protected]> wrote:
>>
>> Hello Nick,
>>
>> On Tue, Oct 21, 2014 at 3:51 PM, Nicholas Mitri <[email protected]> wrote:
>>>
>>> The relation extends past that; it’s ||input1,input2|| ->
>>> ||encSDR1,encSDR2|| -> ||colSDR1,colSDR2||.
>>>
>>> The assumption holds perfectly for the first transition. For the
>>> second, things get a bit muddy because ...
>>
>> Actually, I meant the 2nd transition, ||inputs|| -> ||colSDR||, as
>> encoders do not produce SDRs but just a vector representation of the
>> input.
>>
>>> ... of the *random partial* connectivity of columns. If for 2 different
>>> patterns the set of active columns only changes by 1 column (e.g.
>>> {1,2,4,7} for pattern1 and {1,2,3,7} for pattern2), there’s no way for
>>> you to know whether that difference was caused by a change of 2
>>> resolution steps or 5 resolution steps at the level of the raw input.
>>> So, distances in raw feature space don’t translate to the
>>> high-dimensional binary space of columnar activations.
>>
>> If the change of 2 vs. 5 resolution steps in the raw input is
>> significant (according to the encoder settings, the SP's available
>> resources (#cols) vs. the diversity of patterns, ...), you may not be
>> able to tell from ||sdr1,sdr2|| exactly what the ||.|| in the raw inputs
>> was, but the (linear?) correlation
>> ||input,input|| < ||input,input2 (diff 2 res. steps)|| < ||input,input5||
>> -->
>> ||sdr,sdr|| < ||sdr,sdr2|| < ||sdr,sdr5||
>> means you could still find a threshold separating anomalous from normal.
>>
>>> If you’ve seen the thesis chapter I posted here some time ago about
>>> using the SP for clustering, you can see what I describe here happening
>>> for different runs of the clustering code. I’d get different clusters
>>> each time, with no predictable cluster shapes. I’ll attach the
>>> clustering results here for your reference.
>>
>> Not sure; I'll search for it and take a look!
>>
>>> If you extend the conclusions of the clustering experiments to spatial
>>> anomaly detection, I think it’s fair to assume that there is no way to
>>> use columnar activation to compute a continuous anomaly score in the
>>> same way you can for distance-based anomaly detectors.
>>
>> Would this imply "anomaly detection in NuPIC does not work"? Because if
>> we assume it's impossible to get an anomaly score from a lower layer
>> (the SP), can we do it in the TP, which takes the SP's output as its
>> input?
>>
>>> I’ll attach the clustering results here for your reference.
>>
>> I like the visualization! :) What is the problem with it? The clusters
>> are more or less the same shape and have the same (spatial)
>> distribution. Is the problem that there's essentially no overlap, so you
>> can't say the Red and Green clusters are closer than the Red and Blue
>> ones, as 1~3 is closer than 1~5? I think it's because the SP is not
>> saturated enough (for its size vs. the small input range).
>>
>> This might be hard to visualize: you need enough columns for the SP to
>> work well (the default ~2048), so it has "too much" capacity and the
>> clusters come out totally distinct. It would be interesting if you
>> visualized the same SP trained on 2000 numbers (maybe more/less?).
>>
>> regards, Mark
>>
>> --
>> Marek Otahal :o)
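The overlap metric debated throughout this thread can be made concrete with a small sketch (function names are hypothetical; the sets stand in for active-column indices, not output of a real SP). Nicholas's caveat applies: a one-column difference at the SP level can correspond to either 2 or 5 resolution steps in the raw input, so a score like this supports at most a coarse anomaly threshold rather than a calibrated distance:

```python
def overlap(sdr_a, sdr_b):
    """Number of active bits shared by two SDRs, each given as a set of
    active-bit (column) indices."""
    return len(sdr_a & sdr_b)

def spatial_anomaly_score(reference, current):
    """1.0 means no shared active columns (maximally anomalous),
    0.0 means identical activation."""
    smaller = min(len(reference), len(current))
    if smaller == 0:
        return 1.0
    return 1.0 - overlap(reference, current) / smaller

# The thread's toy example: two patterns whose active-column sets
# differ by a single column.
pattern1 = {1, 2, 4, 7}
pattern2 = {1, 2, 3, 7}
print(overlap(pattern1, pattern2))                # 3
print(spatial_anomaly_score(pattern1, pattern2))  # 0.25
```

Flagging an input as anomalous would then reduce to `spatial_anomaly_score(...) > threshold` for some empirically chosen threshold, which is consistent with Marek's point that a threshold may be findable even when exact raw-input distances are not recoverable.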
