On Fri, Nov 22, 2013 at 6:55 AM, Marek Otahal <[email protected]> wrote:

> Guys,
>
> I want to run some benchmarks on the CLA, one of which includes what I
> called (information) capacity.
>
> This is #number of patterns a spatial pooler (SP) (with a fixed number of
> columns) (and probably fixed number of training rounds) can distinguish.
>
> So assuming I have a SP with 1000 columns and 2% sparsity (=20 cols ON at
> all times) and an encoder big enough to express a large range of patterns
> (say a scalar encoder for 0...1,000,000,000).
>
> The top cap is (100 choose 20) which is some crazy number of 5*10^20. All
> these SDRs will be sparse, but not distributed (right??) because a change
> in one bit will already be another pattern.
>

The number of possible unique SP outputs is (1000 choose 20), or ~10^41,
all at 2% sparsity. Changing one input bit doesn't necessarily result in a
different SP output, though: there can be many more input patterns than
combinations of 20 active SP columns. For instance, 1000 input bits allow
2^1000 ≈ 10^301 possible patterns. And regardless of that, the semantic
information learned by the SP is spread across the 1000 columns, so the
representations are still distributed.
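As a quick sanity check on those counts (a minimal sketch; `math.comb` requires Python 3.8+):

```python
import math

n_columns, n_active = 1000, 20

# Number of distinct 2%-sparse SP outputs: (1000 choose 20)
unique_outputs = math.comb(n_columns, n_active)
print(f"(1000 choose 20) ~ 10^{math.floor(math.log10(unique_outputs))}")  # ~10^41

# Number of distinct binary patterns over 1000 input bits: 2^1000
n_input_bits = 1000
print(f"2^{n_input_bits} ~ 10^{math.floor(math.log10(2 ** n_input_bits))}")  # ~10^301
```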

>
> So my question is, what is the "usable" capacity where all outputs are
> still sparse (they all are) and distributed (=robust to noise)? Is there a
> percentage of noisy bits (say 20%) at which the pattern is still
> recognized and the representation still counts as distributed/robust?
>

This is a valid question for real-world data, but the answer depends
entirely on the particular dataset. For instance, regardless of the SP
parameters, the dataset may have 10000 input bits of which only ~50 change
regularly. Noise tolerance is then limited by the dataset itself.
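One way to check how much of the input space a dataset actually uses is to count the bits that ever change across it (a hypothetical helper for illustration, not part of NuPIC):

```python
def changing_bits(patterns):
    """Return indices of input bits that are not constant across the dataset."""
    return [i for i, column in enumerate(zip(*patterns)) if len(set(column)) > 1]

# Toy dataset: 4 patterns over 6 input bits; only bits 1 and 4 ever vary.
data = [
    [0, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
]
print(changing_bits(data))  # [1, 4]
```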

>
>
> Or is it the other way around: does the SP try to maximize this robustness
> for the given number of patterns it is presented? If I feed it a huge
> number of patterns, will I pay the obvious price of reducing the border
> between two patterns?
>

I think the answer to the first question is yes, but to the second no. The
SP attempts to maximize the distance between the columns' representations
relative to the actual data (rather than the entire input space). But
feeding in many patterns doesn't necessarily hurt this: if the input data
are not random, then the more data the SP sees, the more I would expect the
columns to converge toward the optimal representations.
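The "distance" here is typically measured as SDR overlap, i.e. the number of active columns two representations share (a small helper for illustration):

```python
def sdr_overlap(a, b):
    """Overlap between two SDRs, given as sets of active column indices."""
    return len(set(a) & set(b))

a = {3, 17, 42, 99, 256}
b = {3, 17, 42, 100, 300}
print(sdr_overlap(a, b))  # 3 of 5 active columns shared
```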

>
> Either way, is there a reasonable way to measure what I defined a
> capacity?
>
> I was thinking like:
>
> for _ in range(10):  # training repetitions
>     for p in patterns_to_present:
>         sp.input(p)
>
> sp.disableLearning()
> for p in patterns_to_present:
>     # what should the percentage be? see above
>     p_mod = randomize_some_percentage_of_pattern(p, percentage)
>     if sp.input(p) == sp.input(p_mod):
>         pass  # ok, same output: pattern learned despite noise
>

This seems like a good methodology for determining how tolerant the model
is to noise for this particular dataset. The amount of data fed in before
disabling learning will have a large impact on the noise tolerance (but
with diminishing returns).
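That loop can be prototyped end to end with a toy stand-in for the SP: a fixed random projection with top-k winner-take-all. This is not the real NuPIC SP and omits learning, so it only exercises the comparison methodology; `make_toy_sp`, the parameter values, and the bit-flip noise model are all assumptions for the sketch.

```python
import random

def make_toy_sp(n_inputs, n_cols, n_active, seed=42):
    """Toy stand-in for a spatial pooler: a fixed random projection followed
    by top-k winner-take-all. No learning; not the real NuPIC SP."""
    rng = random.Random(seed)
    weights = [[rng.random() for _ in range(n_inputs)] for _ in range(n_cols)]

    def sp(bits):
        # Overlap of each column's weights with the active input bits.
        overlaps = [sum(w for w, b in zip(row, bits) if b) for row in weights]
        winners = sorted(range(n_cols), key=lambda c: overlaps[c], reverse=True)
        return frozenset(winners[:n_active])  # the output SDR

    return sp

def randomize_some_percentage_of_pattern(bits, percentage, rng):
    """Flip `percentage` of the input bits (one possible noise model)."""
    bits = list(bits)
    n_flip = int(percentage * len(bits))
    for i in rng.sample(range(len(bits)), n_flip):
        bits[i] ^= 1
    return bits

n_inputs, n_cols, n_active = 200, 100, 5
sp = make_toy_sp(n_inputs, n_cols, n_active)
rng = random.Random(0)

# 20 random ~10%-sparse binary input patterns.
patterns = [[int(rng.random() < 0.1) for _ in range(n_inputs)]
            for _ in range(20)]

# Count how many patterns map to the same SDR under 5% input noise.
recognized = sum(sp(p) == sp(randomize_some_percentage_of_pattern(p, 0.05, rng))
                 for p in patterns)
print(f"{recognized}/{len(patterns)} patterns map to the same SDR at 5% noise")
```

Sweeping the noise percentage (instead of the fixed 5% here) would give the tolerance curve the question is after.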

>
>
> Thanks for your replies,
> Mark
>
>
> --
> Marek Otahal :o)
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>