Hi Doug,

Thanks for the pointer. I like Kanerva’s work a lot. It is foundational stuff, and I would like our work to build on that theory. I believe that particular section demonstrates a different type of capacity, i.e. how many patterns you can store simultaneously in an SDR. That is more related to this discussion on capacity:
http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2013-November/001823.html

To my knowledge, I don’t think Kanerva really considered the notion of semantic similarity in his work. This is very important for CLAs. Also, he tends to use Hamming distance, which is incorrect; I believe overlap distance is the right metric to use for CLAs. I don’t know his work that well, though. Are my statements true?

—Subutai

On Sat, Nov 23, 2013 at 8:44 AM, Doug King <[email protected]> wrote:

> A lot of the theoretical work on capacity has been done by Kanerva and can
> be referenced here:
> http://www.rni.org/kanerva/sdmchapter-text.pdf
> See the section - 3.3.7. Memory Capacity.
>
> On Sat, Nov 23, 2013 at 6:32 AM, Fergal Byrne <[email protected]> wrote:
>
>> Hi Marek,
>>
>> We had someone looking at this recently; they made a good stab at
>> modelling this kind of capacity question, but I thought there would be a
>> better way to do it.
>>
>> We should use the standard 2K columns for reference. There are 2.4 * 10^84
>> possible 2% activation patterns with a 2K region (this does not include
>> the cells per column, which for 32 gives 5.5 * 10^144 patterns!).
>>
>> As we discussed before, we should think in terms of confidence when
>> evaluating capacity. In other words, if we are checking to see whether a
>> pattern is represented, what is the probability that it appears by chance
>> rather than because it was actually stored? To calculate this, assume the
>> pattern is not stored, and calculate how often it will appear as a result
>> of the combination of bits from other stored patterns. This will give us
>> an indication of the number of patterns which need to be stored to reach
>> a certain probability of seeing a false pattern.
>>
>> Let's assume we want 95% confidence that a pattern is real. This means
>> that 5% of the time the pattern is created by chance (i.e. the other
>> patterns happen to produce all the bits in our pattern).
>> How many patterns need to be stored (i.e. turning on all their bits) for
>> our bits to be on 5% of the time?
>>
>> We assume that each bit is equally likely to be on (this is probably
>> wrong in practice, but we'll need to make the assumption). Then 1/2048 of
>> the patterns will include any given bit. In other words, 2047/2048 of the
>> time you store a pattern, the bit stays off (and is on only if our
>> pattern is the cause).
>>
>> So, let's say we start off with no patterns stored, and we add patterns
>> at random until there is a 5% chance of a false match with a given
>> pattern. That number of patterns represents the storage capacity limit at
>> 95% confidence.
>>
>> At the beginning there are no patterns, so the probability is 0. For each
>> pattern we add, there is an additional 1/2048 chance that a given bit has
>> been switched on by now. So, after (0.05 / (1/2048)) = 102.4 additions,
>> there is a 5% chance that the bit is on. Assuming independence of bits
>> (again, a big assumption), we'd need 39 further sets of such trials
>> before we had a 5% chance of all 40 bits being on. This is a total of
>> 4096 patterns which would need to be stored before our pattern has a 5%
>> chance of showing up falsely.
>>
>> To generalise: we have N columns and a fraction n of them are turned on.
>> We want confidence p that our pattern is really present. The number of
>> patterns you have to add before a given pattern appears by chance is
>> then:
>>
>> (1-p) * N * (n * N) = n * N^2 * (1-p)
>>
>> The capacity of the SDR is thus quadratic in the number of columns, and
>> is proportional to both the activation density n and the error tolerance
>> (1-p).
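A quick sanity check of the numbers above, sketched in Python (the variable names and the `cap` helper are mine; `math.comb` gives the exact binomial count for the 2.4 * 10^84 figure, and the formula reproduces the hand-computed 40 * 102.4 ≈ 4096 at 95% confidence):

```python
import math

# ~2% of a 2K region: 40 of 2048 columns active
n_cols, n_active = 2048, 40

# number of distinct column-activation patterns (the 2.4 * 10^84 above)
col_patterns = math.comb(n_cols, n_active)

# Fergal's capacity estimate: patterns stored before a given pattern
# shows up by chance with probability (1 - p)
def cap(n_cols, p, n):
    return n * n_cols ** 2 * (1 - p)

print(col_patterns)           # on the order of 10^84
print(cap(2048, 0.95, 0.02))  # close to the hand count of 4096
```

The small gap between 4096 and the formula's value comes from rounding 2% of 2048 to 40 bits in the hand count.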
>>
>> Here's a plot of the capacity at 90%, 95% and 99% confidence levels
>> versus region size:
>>
>> [image: Inline image 1]
>>
>> Here's the Mathematica code which generated that (N is a reserved symbol
>> in Mathematica, so the column count is named cols):
>>
>> cap[cols_, p_, n_] := n * cols^2 * (1 - p)
>>
>> Plot[Evaluate[Table[cap[cols, p, .02], {p, {0.9, 0.95, 0.99}}]],
>>  {cols, 512, 2048},
>>  Filling -> Axis,
>>  PlotLegend -> {"90%", "95%", "99%"}, LegendPosition -> {1, -0.0},
>>  LegendShadow -> None, LegendBorder -> None,
>>  AxesLabel -> {"Columns", "Patterns"}, GridLines -> Automatic]
>>
>> Regards,
>>
>> Fergal Byrne
>>
>> On Fri, Nov 22, 2013 at 2:55 PM, Marek Otahal <[email protected]> wrote:
>>
>>> Guys,
>>>
>>> I want to run some benchmarks on the CLA, one of which covers what I
>>> called (information) capacity: the number of patterns a spatial pooler
>>> (SP) (with a fixed number of columns, and probably a fixed number of
>>> training rounds) can distinguish.
>>>
>>> So assume I have an SP with 1000 columns and 2% sparsity (= 20 cols ON
>>> at all times) and an encoder big enough to express a large range of
>>> patterns (say a scalar encoder for 0...1,000,000,000).
>>>
>>> The top cap is (1000 choose 20), which is some crazy number around
>>> 3.4 * 10^41. All these SDRs will be sparse, but not distributed
>>> (right??), because a change in one bit already gives another pattern.
>>>
>>> So my question is: what is the "usable" capacity where all outputs are
>>> still sparse (they all are) and distributed (= robust to noise)? Is
>>> there a percentage of bits (say, 20% of the bits noisy while the
>>> pattern is still recognized) that is still considered
>>> distributed/robust?
>>>
>>> Or is it the other way around, and the SP tries to maximize this
>>> robustness for the given number of patterns it is presented? If I feed
>>> it a huge number of patterns, will I pay the obvious price of reducing
>>> the border between two patterns?
>>>
>>> Either way, is there a reasonable way to measure what I defined as
>>> capacity?
>>>
>>> I was thinking like:
>>>
>>> for _ in range(10):
>>>     for p in patterns_to_present:
>>>         sp.input(p)
>>>
>>> sp.disableLearning()
>>> for p in patterns_to_present:
>>>     # what should the percentage be? see above
>>>     p_mod = randomize_some_percentage_of_pattern(p, percentage)
>>>     if sp.input(p) == sp.input(p_mod):
>>>         pass  # ok, it's the same: pattern learned
>>>
>>> Thanks for your replies,
>>> Mark
>>>
>>> --
>>> Marek Otahal :o)
>>>
>>> _______________________________________________
>>> nupic mailing list
>>> [email protected]
>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>
>> --
>>
>> Fergal Byrne, Brenter IT
>>
>> <http://www.examsupport.ie>http://inbits.com - Better Living through
>> Thoughtful Technology
>>
>> e:[email protected] t:+353 83 4214179
>> Formerly of Adnet [email protected] http://www.adnet.ie
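Fergal's thought experiment can also be simulated directly, without a spatial pooler: store random SDRs as a union of active bits and measure how often an unrelated SDR is fully covered by that union. This is an editorial sketch, not NuPIC API; the region size, bit counts and the `false_match_prob` helper are illustrative, scaled down to a 256-column region so it runs quickly:

```python
import random

def false_match_prob(n_cols, n_active, n_stored, trials=500, seed=42):
    """Estimate how often a never-stored SDR looks 'present' because
    the union of the stored patterns happens to cover all its bits."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # union of bits switched on by the stored random patterns
        on_bits = set()
        for _ in range(n_stored):
            on_bits.update(rng.sample(range(n_cols), n_active))
        # a probe pattern that was never stored
        probe = set(rng.sample(range(n_cols), n_active))
        if probe <= on_bits:  # every probe bit explained by other patterns
            hits += 1
    return hits / trials

# storing more patterns raises the chance of a false match
few = false_match_prob(256, 5, n_stored=20)
many = false_match_prob(256, 5, n_stored=200)
```

With this toy sizing the false-match rate jumps from near zero at 20 stored patterns to most of the time at 200, the kind of capacity cliff the confidence argument above predicts. A trained SP should behave differently, since learned bits are not uniformly likely, which is exactly the assumption Fergal flags.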
