Not to interrupt, but remember we determined that that setting (potentialPct) won't work until the fix is merged?
On Wed, Oct 22, 2014 at 11:51 AM, Subutai Ahmad <[email protected]> wrote:
>
> Thanks.
>
> Yes, your reasoning about potentialPct = 1 seems right. I too think that
> setting doesn't work in general. As you say, a small number of columns can
> start to dominate and become fully connected.
>
> I have had the best luck with a number well above 0.5, though. With 0.5,
> only about 25% will be initially connected. With w = 21, that is only about
> 5 or 6 bits. 5 or 6 makes me uncomfortable: it is insufficient for reliable
> performance and can cause random changes to have large effects. I would
> suggest trying something like 0.8 or so.
>
> Also, in my experience I had better luck with a smaller learning rate. I
> would suggest trying a synPermActiveInc of 0.001 and a synPermInactiveDec
> of about half that.
>
> --Subutai
>
> On Wed, Oct 22, 2014 at 9:01 AM, Nicholas Mitri <[email protected]> wrote:
>
>> I was trying different configurations just now. The best results are
>> achieved with:
>>
>>     SP(self.inputShape,
>>        self.columnDimensions,
>>        potentialRadius=self.inputSize,
>>        potentialPct=0.5,
>>        numActiveColumnsPerInhArea=int(self.sparsity * self.columnNumber),
>>        globalInhibition=True,
>>        synPermActiveInc=0.05)
>>
>> potentialPct and synPermActiveInc have the most impact on the results.
>> Specifically, setting potentialPct to 1 has a very negative effect on the
>> SP's behavior, as seen below. I suspect that setting this parameter to 1,
>> thus allowing all columns to "see" all inputs, levels the field of
>> competition and causes the top 2% set to change drastically from one input
>> to the next. A lower setting allows a dominant and more stable set of
>> columns to be established, which would explain why the overlap drops
>> gradually.
>>
>> On Oct 22, 2014, at 6:47 PM, Subutai Ahmad <[email protected]> wrote:
>>
>> Hi Nicholas,
>>
>> I think these are great tests to do. Can you share the SP parameters you
>> used for this?
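A quick back-of-the-envelope check of the "5 or 6 bits" estimate above. It assumes, per the thread, that roughly half of a column's potential synapses start out above the connected permanence threshold; that 0.5 figure, and the variable names, are assumptions for illustration, not taken from the SP code:

```python
# Rough arithmetic behind Subutai's estimate. Assumption: about half of a
# column's potential synapses are initialized above the connected threshold.
w = 21                     # active bits in the RDSE encoding
init_connected_frac = 0.5  # assumed fraction of potential synapses initially connected

for potential_pct in (0.5, 0.8, 1.0):
    seen = potential_pct * init_connected_frac  # fraction of input a column connects to
    bits = seen * w                             # expected overlap with one encoding
    print(f"potentialPct={potential_pct}: ~{seen:.0%} connected, ~{bits:.1f} of {w} bits")
```

With potentialPct = 0.5 this gives ~25% connected and ~5.25 bits, matching the "5 or 6 bits" concern; 0.8 raises it to ~8 bits.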
>> What was the potentialPct, the learning rates, and the inhibition setting?
>>
>> Thanks,
>>
>> --Subutai
>>
>> On Wed, Oct 22, 2014 at 6:45 AM, Nicholas Mitri <[email protected]> wrote:
>>
>>> Hey Mark,
>>>
>>> To follow up on our discussion yesterday, I ran a few tests on a
>>> 1024-column SP with 128-bit-long (w = 21) RDSE input.
>>> I fed the network inputs in the range [1-20] and calculated the overlap
>>> of the output of the encoder, and of the output of the SP, with the
>>> corresponding outputs for input = 1. The plots below show 3 different
>>> runs under the same settings.
>>>
>>> The overlap at the encoder level is a straight line, as expected, since
>>> the RDSE resolution is set to 1. The green plot shows the overlap at the
>>> SP level.
>>> Looking at these plots, it appears my claim is true: raw-input distance
>>> does not reliably translate to overlap. The good news is that it seems
>>> to be a rarity for the condition to break. Specifically, notice that in
>>> the 3rd plot, input 13 has more overlap with 1 than 12 does, breaking
>>> the condition. Also, notice the effect of random initialization on the
>>> shape of the green plot, which shows no consistent relation with the
>>> encoder overlap.
>>>
>>> Taking all this into consideration: since the assumption seems to hold
>>> in most cases and the SP overlap is non-increasing, I think we can
>>> leverage the overlap for spatial anomaly detection as discussed earlier,
>>> but I see little promise of it performing well given the inconsistency
>>> of the overlap metric.
>>>
>>> <fig3.png>
>>> <fig4.png>
>>> <fig5.png>
>>>
>>> On Oct 21, 2014, at 6:13 PM, Marek Otahal <[email protected]> wrote:
>>>
>>> Hello Nick,
>>>
>>> On Tue, Oct 21, 2014 at 3:51 PM, Nicholas Mitri <[email protected]> wrote:
>>>>
>>>> The relation extends past that; it's ||input1, input2|| ->
>>>> ||encSDR1, encSDR2|| -> ||colSDR1, colSDR2||.
>>>>
>>>> The assumption holds perfectly for the first transition.
>>>> For the second, things get a bit muddy because ...
>>>
>>> Actually, I meant the 2nd transition, ||inputs|| -> ||colSDR||, as
>>> encoders do not produce SDRs, just vector representations of the input.
>>>
>>>> ... of the *random partial* connectivity of columns. If for 2 different
>>>> patterns the set of active columns changes by only 1 column (e.g.
>>>> {1,2,4,7} for pattern1 and {1,2,3,7} for pattern2), there's no way to
>>>> know whether that difference was caused by a change of 2 resolution
>>>> steps or of 5 resolution steps at the level of the raw input. So
>>>> distances in raw feature space don't translate to the high-dimensional
>>>> binary space of columnar activations.
>>>
>>> If the change of 2 vs. 5 resolution steps in the raw input is
>>> significant (relative to the encoder settings, the SP's available
>>> resources (#cols) vs. the diversity of patterns, ...), you may not be
>>> able to tell from ||sdr1, sdr2|| exactly what the ||.|| between the raw
>>> inputs was, but the (linear?) correlation
>>> ||input, input|| < ||input, input2 (2 res. steps away)|| < ||input, input5||
>>> -->
>>> ||sdr, sdr|| < ||sdr, sdr2|| < ||sdr, sdr5||
>>> should hold; thus you could find a threshold separating anomaly from
>>> normal.
>>>
>>>> If you've seen the thesis chapter I posted here some time ago about
>>>> using the SP for clustering, you can see what I describe here happening
>>>> across different runs of the clustering code. I'd get different
>>>> clusters each time, with no predictable cluster shapes. I'll attach the
>>>> clustering results here for your reference.
>>>
>>> Not sure; I'll search for it and take a look!
>>>
>>>> If you extend the conclusions of the clustering experiments to spatial
>>>> anomaly detection, I think it's fair to assume that there is no way to
>>>> use columnar activations to compute a continuous anomaly score the way
>>>> you can for distance-based anomaly detectors.
>>>
>>> Would this imply "anomaly detection in NuPIC does not work"?
>>> Because if we assume it's impossible to get an anomaly score from the
>>> lower layer (the SP), can we do it in the TP, which takes the SP's
>>> output as its input?
>>>
>>> > I'll attach the clustering results here for your reference.
>>>
>>> I like the visualization! :) What is the problem with it? The clusters
>>> are more or less the same shape and have the same (spatial)
>>> distribution. Is the problem that there's essentially no overlap, so you
>>> can't say the Red and Green clusters are closer than the Red and Blue
>>> ones, the way 1~3 is closer than 1~5? I think it's because the SP is not
>>> saturated enough (for its size vs. the small input range).
>>>
>>> This might be hard to visualize: you need enough columns for the SP to
>>> work well (the default ~2048), so it has "too much" capacity and the
>>> clusters come out totally distinct. It would be interesting to visualize
>>> the same SP trained on 2000 numbers (maybe more, maybe fewer)?
>>>
>>> regards, Mark
>>>
>>> --
>>> Marek Otahal :o)

--
*We find it hard to hear what another is saying because of how loudly "who one is" speaks...*
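For reference, the overlap metric used throughout this thread, and Marek's point that a monotone overlap/distance relation is enough to threshold on, can be sketched as follows. All names and numbers are toy values, not from the actual test code, and a real RDSE assigns random (not contiguous) bits per bucket; the contiguous shift is used only to make the falloff easy to see:

```python
def overlap(sdr1, sdr2):
    """Number of positions active in both binary vectors."""
    return sum(a & b for a, b in zip(sdr1, sdr2))

# Toy stand-ins for RDSE encodings (n = 128, w = 21, resolution = 1).
enc1 = [1] * 21 + [0] * 107            # stand-in for encode(1)
enc3 = [0] * 2 + [1] * 21 + [0] * 105  # stand-in for encode(3), 2 buckets away

print(overlap(enc1, enc1))  # 21: identical inputs share every active bit
print(overlap(enc1, enc3))  # 19: overlap falls off with raw-input distance

# Marek's point: even if overlap doesn't recover the exact raw distance,
# a monotone relation is enough to pick an anomaly threshold.
def is_spatial_anomaly(overlap_score, w=21, threshold_frac=0.5):
    """Flag inputs whose overlap with a reference falls below a fraction of w."""
    return overlap_score < threshold_frac * w

print(is_spatial_anomaly(19))  # False: close to the reference
print(is_spatial_anomaly(4))   # True: likely a spatial anomaly
```

The threshold fraction here is arbitrary; the thread's concern is precisely that the SP-level overlap is inconsistent enough across runs that no single threshold may work well.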

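On Marek's question about moving the score up to the TP: as I understand it, NuPIC's usual anomaly score is indeed computed one level above the SP, as the fraction of currently active columns that were not predicted from the previous timestep. A minimal sketch of that idea (function name and values are illustrative, not the library API):

```python
def raw_anomaly_score(active_cols, predicted_cols):
    """Fraction of active columns that were not predicted last step."""
    active = set(active_cols)
    if not active:
        return 0.0
    return len(active - set(predicted_cols)) / len(active)

print(raw_anomaly_score({1, 2, 3, 7}, {1, 2, 3, 7}))  # 0.0: fully predicted
print(raw_anomaly_score({1, 2, 3, 7}, {1, 2}))        # 0.5: half unpredicted
```

This sidesteps the SP-level overlap inconsistency discussed above because it compares the SP's output against the temporal model's own predictions rather than against a reference input.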