On Fri, Nov 22, 2013 at 6:55 AM, Marek Otahal <[email protected]> wrote:
> Guys,
>
> I want to run some benchmarks on the CLA, one of which includes what I
> called (information) capacity.
>
> This is the number of patterns a spatial pooler (SP) (with a fixed number
> of columns, and probably a fixed number of training rounds) can
> distinguish.
>
> So assume I have an SP with 1000 columns and 2% sparsity (= 20 columns ON
> at all times) and an encoder big enough to express a large range of
> patterns (say a scalar encoder for 0...1,000,000,000).
>
> The top cap is (100 choose 20), which is some crazy number, 5*10^20. All
> these SDRs will be sparse, but not distributed (right??), because a
> change in one bit will already be another pattern.

The number of possible unique SP outputs is (1000 choose 20), or ~10^41.
These are all at 2% sparsity. Changing one input bit doesn't necessarily
result in a different SP output, though. There can be many more input bit
patterns than combinations of 20 SP columns: for instance, 1000 input bits
admit 2^1000, or roughly 10^300, possible patterns. And regardless of that,
the semantic information learned by the SP is distributed across the 1000
columns, so it would still be distributed.

> So my question is: what is the "usable" capacity where all outputs are
> still sparse (they all are) and distributed (= robust to noise)? Is there
> a percentage of bits (say, 20% of bits chaotic and the pattern is still
> recognized) that is still considered distributed/robust?

This is still a valid question for real-world datasets, but it is
completely dependent on the particular dataset. For instance, regardless
of the SP parameters, the dataset may have 10000 input bits but only ~50
of them that change regularly. The tolerance to noise at that point is
limited by the dataset.

> Or is it the other way around, and the SP tries to maximize this
> robustness for the given number of patterns it is presented with? If I
> feed it a huge number of patterns, will I pay the obvious price of
> reducing the border between two patterns?
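As a quick aside, the counting argument above can be checked directly. A minimal standalone sketch (plain Python, not NuPIC code):

```python
from math import comb, log10

n_columns = 1000   # SP columns from the example above
n_active = 20      # 2% sparsity -> 20 active columns

# Upper bound on distinct SP outputs: ways to choose 20 active columns
# out of 1000.
capacity = comb(n_columns, n_active)
print(f"C({n_columns}, {n_active}) ~ 10^{log10(capacity):.1f}")

# For comparison, 1000 binary input bits admit 2^1000 possible input
# patterns, which dwarfs the number of distinct SP outputs.
print(f"2^1000 ~ 10^{log10(2 ** 1000):.1f}")
```

So the ~10^41 output bound is tiny next to the ~10^300 input space, which is why many input patterns must map to the same output.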
I think the answer to the first question is yes, but to the second no. The
SP attempts to maximize the distance between the column input bits relative
to the actual data (rather than the entire input space). But feeding many
patterns in doesn't necessarily have an impact on this. If the input data
are not random, then the more data fed into the SP, the more I would expect
the columns to converge to the optimal representations.

> Either way, is there a reasonable way to measure what I defined as
> capacity?
>
> I was thinking something like:
>
> for 10 repetitions:
>     for p in patterns_to_present:
>         sp.input(p)
>
> sp.disableLearning()
> for p in patterns_to_present:
>     # what should the percentage be? see above
>     p_mod = randomize_some_percentage_of_pattern(p, percentage)
>     if sp.input(p) == sp.input(p_mod):
>         # ok, it's the same; pattern learned

This seems like a good methodology for determining how tolerant the model
is to noise for this particular dataset. The amount of data fed in before
disabling learning will have a large impact on the noise tolerance (but
with diminishing returns).

> Thanks for your replies,
> Mark
>
> --
> Marek Otahal :o)
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
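P.S. The measurement loop sketched in the quote only needs some SP-like black box, so it can be made runnable end to end. Below is a self-contained sketch; `ToySP` is a hypothetical stand-in (a fixed random projection plus top-k winner-take-all, with no learning), not the real SP, and the helper names just mirror the pseudocode:

```python
import random

def make_pattern(n_bits=100, n_on=20, rng=random):
    """Random binary pattern with fixed sparsity."""
    on = set(rng.sample(range(n_bits), n_on))
    return [1 if i in on else 0 for i in range(n_bits)]

def randomize_some_percentage_of_pattern(p, percentage, rng=random):
    """Flip `percentage` percent of the bits of p (the noise step above)."""
    p = list(p)
    n_flip = int(len(p) * percentage / 100)
    for i in rng.sample(range(len(p)), n_flip):
        p[i] ^= 1
    return p

class ToySP:
    """Stand-in for a spatial pooler: fixed random projection plus
    top-k winner-take-all. No learning -- just enough to run the loop."""
    def __init__(self, n_inputs=100, n_columns=1000, n_active=20, seed=42):
        rng = random.Random(seed)
        self.weights = [[rng.random() for _ in range(n_inputs)]
                        for _ in range(n_columns)]
        self.n_active = n_active

    def input(self, pattern):
        # overlap of each column's weights with the ON input bits
        overlaps = [sum(w for w, bit in zip(row, pattern) if bit)
                    for row in self.weights]
        ranked = sorted(range(len(overlaps)), key=overlaps.__getitem__,
                        reverse=True)
        # active columns = top-k by overlap (2% sparsity)
        return frozenset(ranked[:self.n_active])

if __name__ == "__main__":
    rng = random.Random(0)
    sp = ToySP()
    patterns = [make_pattern(rng=rng) for _ in range(50)]
    for percentage in (0, 5, 20, 40):
        same, overlap = 0, 0.0
        for p in patterns:
            a = sp.input(p)
            b = sp.input(randomize_some_percentage_of_pattern(p, percentage, rng))
            same += (a == b)
            overlap += len(a & b) / sp.n_active
        print(f"{percentage:2d}% noise: {same}/{len(patterns)} identical, "
              f"mean column overlap {overlap / len(patterns):.2f}")
```

Tracking the mean active-column overlap alongside strict equality gives a graded picture of robustness rather than a pass/fail count. With real NuPIC, `ToySP` would be replaced by an actual trained SpatialPooler with learning disabled before the noise pass, as in the pseudocode.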
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
