Guys,

I want to run some benchmarks on the CLA, one of which is what I call
(information) capacity.

This is the number of patterns a spatial pooler (SP) with a fixed number
of columns (and probably a fixed number of training rounds) can distinguish.

So assume I have an SP with 1000 columns and 2% sparsity (= 20 columns ON
at all times) and an encoder big enough to express a large range of
patterns (say a scalar encoder for 0 ... 1,000,000,000).

The top cap is (1000 choose 20), which is some crazy number around
3.4 * 10^41. All these SDRs will be sparse, but not distributed (right??),
because a change in a single bit already yields another pattern.
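That ceiling is just a binomial coefficient, so it can be checked directly; a minimal sketch in plain Python (no NuPIC needed):

```python
# Theoretical ceiling on distinct SDRs for n columns with w active bits,
# using the n=1000, w=20 values from the example above.
import math

n, w = 1000, 20
capacity = math.comb(n, w)  # number of distinct w-of-n binary patterns
print(capacity)             # roughly 3.4e41
```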

So my question is: what is the "usable" capacity, where all outputs are
still sparse (they all are) and distributed (= robust to noise)? Is there
a percentage of bits (say, 20% of the bits flipped and the pattern is
still recognized) at which the representation is still considered
distributed/robust?
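One common way to make "still recognizes the pattern" concrete is to compare SDRs by overlap (shared active columns) against a match threshold. A toy sketch — the function names and the 50% threshold are my own assumptions, not NuPIC API:

```python
# Overlap between two SDRs, each given as a collection of active column
# indices, used as a simple robustness / similarity measure.
def overlap(sdr_a, sdr_b):
    """Number of active columns the two SDRs share."""
    return len(set(sdr_a) & set(sdr_b))

def matches(sdr_a, sdr_b, w=20, theta=0.5):
    """Treat two SDRs as the 'same' pattern if they share at least
    theta * w active columns (theta = 0.5 is an arbitrary choice here)."""
    return overlap(sdr_a, sdr_b) >= theta * w

a = [1, 5, 9, 12]          # toy SDRs with w = 4 active bits
b = [1, 5, 9, 40]
print(overlap(a, b))       # -> 3
print(matches(a, b, w=4))  # shares 3 of 4 bits -> True
```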


Or is it the other way around, and the SP tries to maximize this
robustness for the given number of patterns it is presented? If I feed it
a huge number of patterns, I'll pay the obvious price of reducing the
border between two patterns?

Either way, is there a reasonable way to measure what I defined as capacity?

I was thinking of something like:

for _ in range(10):  # repeated training rounds
   for p in patterns_to_present:
      sp.input(p)

sp.disableLearning()
for p in patterns_to_present:
   # what should the percentage be? see above
   p_mod = randomize_some_percentage_of_pattern(p, percentage)
   if sp.input(p) == sp.input(p_mod):
        # ok, outputs match, so the pattern was learned
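The randomize_some_percentage_of_pattern helper from that sketch could look like this, assuming the input is a dense 0/1 vector (the function name comes from my pseudocode above; nothing here is NuPIC API):

```python
# Flip a given percentage of bits of a dense binary pattern at random
# positions, returning a noisy copy of the input.
import random

def randomize_some_percentage_of_pattern(pattern, percentage, rng=random):
    """Return a copy of `pattern` (list of 0/1) with `percentage` percent
    of its bits flipped at distinct random positions."""
    noisy = list(pattern)
    n_flips = int(len(noisy) * percentage / 100.0)
    for i in rng.sample(range(len(noisy)), n_flips):
        noisy[i] = 1 - noisy[i]
    return noisy

p = [0, 1] * 50                      # 100-bit toy input
p_mod = randomize_some_percentage_of_pattern(p, 20)
print(sum(a != b for a, b in zip(p, p_mod)))  # -> 20 bits differ
```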


Thanks for your replies,
Mark


-- 
Marek Otahal :o)
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
