Hi Marek, Subutai,

Apologies for the radio silence over the last week or so. I've been deep-diving into something which I believe will be of interest to everyone - more on this in the next few days, as I still have some more experiments to do.
First off, great that you're doing some investigation into this, /Mar(e?)k/. It's really important that we grasp how powerful SDRs are, and even more so how efficient sequence learning over SDRs is. This is the key to Jeff's theory - nature discovered this idea a few hundred million years ago, and only perfected it when the big stone landed on the Yucatan, wiping out all our competitors.

The key to all of this is probability (or, conversely, confidence). Any learning system with sufficient capacity will only ever encounter an astronomically tiny proportion of the possible sequences of learned patterns, so when you see enough of a sequence you have effectively seen the whole thing (at each level of abstraction). This is the power of the TP. Each time you see a pattern you've predicted, the probability that the rest of the sequence is not going to happen drops exponentially. When you do see something new, it is happening so rarely that you have plenty of resources to either learn it or flag a major breach of your model of the world.

<ad>NuPIC has a major bug: Jeff's CLA theory says that predictive cells fire first, and for no good reason this is not in the SP. Fixing this is both trivial (in terms of work) and significant (in terms of effectiveness for many kinds of data). Anyone interested in adding the 10-15 lines of Python, and perhaps twice that of C++, might like to pick up my issue at https://github.com/numenta/nupic/issues/415</ad>

The enemy of all this is noise. /Mar(e?)k/'s example is nice and clean: loads of distinguishable A's with one hard B at the end. Most data in the real world is not like this, so you'll have many sequences with a spurious Æ, Å, or Ā in there. We should be able to treat these as semantic variants of A and believe that we're still looking at a sequence of A's. The data should decide how good this tolerance is - if we only ever get long sequences of A's (and their near relatives), followed by B's, then that is what we'll learn.
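To make the "astronomically tiny" point concrete, here is a minimal sketch in plain Python (not NuPIC code). The 2048-column / 40-active figures are the usual NuPIC layer defaults; the overlap threshold of 20 is an illustrative assumption, not a NuPIC parameter:

```python
# Sketch: why a partial SDR match is almost certainly the real pattern.
# n = columns in the layer, w = active columns per pattern (~2% sparsity).
from math import comb

n, w = 2048, 40
theta = 20  # assumed overlap needed to count as a "match"

# Number of distinct SDRs of this size: astronomically large.
num_patterns = comb(n, w)
print(f"possible SDRs: ~10^{len(str(num_patterns)) - 1}")

# Probability that a *random* SDR overlaps a given one in >= theta bits
# (hypergeometric tail). It is vanishingly small, so an observed overlap
# is overwhelmingly likely to be the learned pattern, not noise.
p_false_match = sum(
    comb(w, k) * comb(n - w, w - k) for k in range(theta, w + 1)
) / comb(n, w)
print(f"false-match probability: {p_false_match:.3e}")
```

The same arithmetic is what makes the noisy-A case workable: a spurious Æ either shares most of A's bits (a semantic variant) or it matches nothing learned, and random collisions are too rare to matter.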
If the statistics are that we often get 40±5 A's, then at that point there will be a prediction of 80% for another A and 20% for a B. But the probability of any other character will be essentially zero at that stage, allowing you to rule out the significance of Æ, Å, or Ā as anything other than noise.

/Mar(e?)k/, the numbers really tell here, as you hinted. The capacities are of the order of 10^n, where n is 40, 80, 120 or something, even for a small 2048-column layer. The real issue then becomes one of ensuring that these capacities are exploited. This is all about presenting the input data well, and having a good enough learning algorithm that the CLA will slice up the data space well.

Regards,

Fergal Byrne

On Sat, Dec 7, 2013 at 9:37 PM, Subutai Ahmad <[email protected]> wrote:

> Hi Mark,
>
> We haven’t had too much discussion about the TP on this list but you ask
> some interesting questions below. We don’t really know a huge amount but we
> do know it can learn extremely long sequences. Consider that each
> transition in a sequence is represented by a number of segments. Each step
> at a minimum would consume at least activationThreshold segments, since you
> need that many active columns to go on to the next step. So, one limit
> with our typical configuration is (128 segments per cell * 32 cells per
> column * 2048 columns) / activationThreshold. That’s a sequence about 1/2 million
> steps long! We could theoretically “hand construct” a sequence that long
> and it should work.
>
> In practice the length is likely to be a lot lower, but it’s still probably
> pretty long (it would be interesting to try this out with random SDRs).
> The length is not really the problem. The difficulty of the sequences
> (like the one you have below) is more interesting. We have some tests
> already in NuPIC of lower-order vs. higher-order sequences.
> Please take a look at this file:
>
> nupic/tests/integration/py2/nupic/algorithms/tp_test.py
>
> It would be really cool to expand on this type of test. There’s a lot
> more we could do to understand the TP better!
>
> —Subutai
>
>
> On Sun, Nov 17, 2013 at 3:00 PM, Marek Otahal <[email protected]> wrote:
>
>> I'm about to create and carry out some benchmarks of the CLA.
>>
>> - for the TP: given n sequences, what's the max length of the sequences it can recall?
>> - test with the hardest sequences? (AAAAAAAAAAAAAAAAAAAAAAAAAAAAB)
>> - resistance to noise (I think Subutai did these? Could we have the graphs, scripts, please?)
>>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

--
Fergal Byrne, Brenter IT
http://www.examsupport.ie
http://inbits.com - Better Living through Thoughtful Technology
e: [email protected] t: +353 83 4214179

Formerly of Adnet [email protected] http://www.adnet.ie
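Subutai's back-of-envelope capacity figure from the quoted message can be checked numerically. A minimal sketch in plain Python (not NuPIC code); activationThreshold = 16 is an assumed value here, since it is a configurable NuPIC parameter:

```python
# Upper bound on learnable sequence length, per Subutai's estimate:
# total segments in the layer divided by the segments consumed per step.
segments_per_cell = 128
cells_per_column = 32
columns = 2048
activation_threshold = 16  # assumed; this NuPIC parameter is tunable

total_segments = segments_per_cell * cells_per_column * columns
max_sequence_steps = total_segments // activation_threshold
print(max_sequence_steps)  # 524288 -- about half a million steps
```

As the quoted message notes, this is only a theoretical ceiling; practical sequence lengths will be lower, and sequence difficulty (not raw length) is the more interesting axis to test.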
