Hi Marek,

We had someone looking at this recently. They made a good stab at modelling
this kind of capacity question, but I thought there would be a better way to
do it.
We should use the standard 2K columns for reference. There are 2.4 * 10^84
possible 2% activation patterns in a 2K region (and this doesn't even count
the cells per column - with 32 cells per column you get 5.5 * 10^144
patterns!).
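A quick sanity check on that 2.4 * 10^84 figure (a Python sketch; I'm
assuming 2% of 2048 means 40 active columns):

```python
from math import comb

# Ways to choose 40 active columns out of 2048 (2% activation)
patterns = comb(2048, 40)
print(f"{patterns:.2e}")  # about 2.4e84
```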
As we discussed before, we should think in terms of confidence when
evaluating capacity. In other words, if we are checking to see if a pattern
is represented, what is the probability that it appears by chance rather
than actually stored? To calculate this, assume the pattern is not stored,
and now calculate how often it will appear as a result of the combination
of bits from other stored patterns. This will give us an indication of the
number of patterns which need to be stored to give us a certain probability
of seeing a false pattern.
Let's assume we want 95% confidence that a pattern is real. This means that
5% of the time the pattern is created by chance (ie the other patterns
happen to produce all the bits in our pattern). How many patterns need to
be stored (each turning on all of its bits) before our pattern's bits are
all on 5% of the time?
We assume that each bit is equally likely to be on (this is probably wrong
in practice, but we'll need to make the assumption). Then, 1/2048 of the
patterns will include any given bit. In other words, 2047/2048 of the time
you store a pattern, the bit is off (and is on only if our pattern is the
cause).
So, let's say we start off with no patterns stored, and we add patterns at
random until there is a 5% chance of a false match with a given pattern.
That number of patterns is the storage capacity limit at 95% confidence.
At the beginning, there are no patterns so the probability is 0. For each
pattern we add, there is an additional 1/2048 chance that a given bit has
been switched on by now. So, after (0.05 / (1/2048)) = 102.4 additions,
there is a 5% chance that the bit is on. Assuming independence of bits
(again a big assumption), we'd need 39 further sets of such trials before
we had a 5% chance of all the bits being on. That makes a total of 4096
patterns which would need to be stored before our pattern has a 5% chance
of showing up by chance.
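That arithmetic replays in a few lines of Python (a sketch of the same
back-of-envelope calculation, keeping the per-pattern 1/2048 assumption and
40 active bits):

```python
from math import log

N = 2048     # columns in the region
bits = 40    # ~2% of 2048 bits active in a pattern
conf = 0.95  # required confidence

# Additions until one given bit has a 5% chance of being on (linear estimate)
per_bit = (1 - conf) / (1.0 / N)
print(round(per_bit, 1))  # 102.4

# Repeat for each of the 40 bits (assuming independence)
total = per_bit * bits
print(round(total, 1))    # 4096.0

# For comparison, the exact geometric version of the per-bit count
# comes out close to the linear estimate:
exact = log(conf) / log(1 - 1.0 / N)
print(round(exact, 1))    # ~105
```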
To generalise: we have N columns, of which a fraction n are turned on, and
we want confidence p that our pattern is really present. The number of
patterns you can add before a given pattern appears by chance (with
probability 1-p) is then:

(1-p) * N * (n * N) = n * N^2 * (1-p)
The capacity of the SDR is thus quadratic in the number of columns, and is
proportional to both the activation density (n) and the error tolerance
(1-p).
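That formula drops straight into a one-line helper (a Python sketch; the
function and argument names are mine). Note it gives 4194 rather than 4096
for the 2K case, because n * N is really 40.96 bits rather than the 40 used
above:

```python
def capacity(cols, conf, density):
    """Patterns storable before a false match is likely, per n * N^2 * (1-p)."""
    return density * cols**2 * (1 - conf)

# The 2K-column example: 2% density, 95% confidence
print(round(capacity(2048, 0.95, 0.02)))  # 4194
```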
Here's a plot of the capacity at 90%, 95% and 99% confidence levels versus
region size:
[image: Inline image 1]
Here's the Mathematica code which generated that:
(* N is a Protected symbol in Mathematica, so use cols for the column count *)
cap[cols_, p_, n_] := n * cols^2 * (1 - p)
Plot[Evaluate[Table[cap[cols, p, .02], {p, {0.9, 0.95, 0.99}}]],
 {cols, 512, 2048},
 Filling -> Axis,
 PlotLegends -> {"90%", "95%", "99%"},
 AxesLabel -> {"Columns", "Patterns"}, GridLines -> Automatic]
Regards,
Fergal Byrne
On Fri, Nov 22, 2013 at 2:55 PM, Marek Otahal <[email protected]> wrote:
> Guys,
>
> I want to run some benchmarks on the CLA, one of which includes what I
> called (information) capacity.
>
> This is #number of patterns a spatial pooler (SP) (with a fixed number of
> columns) (and probably fixed number of training rounds) can distinguish.
>
> So assuming I have a SP with 1000 columns and 2% sparsity (=20 cols ON at
> all times) and an encoder big enough to express a large range of patterns
> (say a scalar encoder for 0...1.000.000.000).
>
> The top cap is (100 choose 20) which is some crazy number of 5*10^20. All
> these SDRs will be sparse, but not distributed (right??) because a change
> in one bit will already be another pattern.
>
> So my question is, what is the "usable" capacity where all outputs are
> still sparse (they all are) and distributed (=robust to noise). Is there a
> percentage of noisy bits (say 20%) at which the pattern is still
> recognized and still considered distributed/robust?
>
>
> Or is it the other way around, and the SP tries to maximize this
> robustness for the given number of patterns it is presented? If I feed it
> a huge number of patterns, will I pay the obvious price of reducing the
> border between two patterns?
>
> Either way, is there a reasonable way to measure what I defined as
> capacity?
>
> I was thinking like:
>
> for 10 repetitions:
>     for p in patterns_to_present:
>         sp.input(p)
>
> sp.disableLearning()
> for p in patterns_to_present:
>     p_mod = randomize_some_percentage_of_pattern(p, percentage)  # what should the percentage be? see above
>     if sp.input(p) == sp.input(p_mod):
>         # ok, it's the same; pattern learned
>
>
> Thanks for your replies,
> Mark
>
>
> --
> Marek Otahal :o)
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>
--
Fergal Byrne, Brenter IT
http://inbits.com - Better Living through Thoughtful Technology
e:[email protected] t:+353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie
<<SDR-Capacity.svg>>
