Yes, now that you mention it, using topology in the SP might be essential for
your word-SDRs.  CEPT's word-SDRs are pretty big and have a 2D topology, so a
basic SP without topology will be limited.

 

The way I think of this with regard to the retina is that the chance of
finding a correlation between two "bits" from the retina is high if the two
bits are close to each other and quickly becomes low as the two bits get
further apart.  This property holds across the entire retina.  In the case of
CEPT word-SDRs there might be discontinuities in the 2D map, but the SP
should handle this automatically and not form coincidence detectors that
span two regions of the map if there are minimal correlations between the
two regions.

 

It is even cooler.  Say one region of the 2D map was all noise, no patterns
to detect.  What will happen is that the coincidence detectors that would
"normally" look at the noise region will migrate their connections to the
adjacent region where patterns can be found.  The SP will learn to ignore
the noise region and instead form better representations of the region with
patterns.  When the SP has topology, all the coincidence detectors are
constantly competing and jostling to find good patterns to represent.

 

I don't know if topology is still in the NuPIC code base.  I think it is,
but I am not certain.  What we did was assign potential synapses to each
coincidence detector.  The potential synapses were in a circular region of
the input space.  We assigned initial permanence values with a Gaussian:
inputs that are closest in the 2D map had the highest permanence values.  We
set the circular region of potential synapses fairly large.  In the real
brain, dendrites and axons are always growing from where they have made
good connections.  The potential synapse pool is dynamic and constantly
trying to expand in the area where good connections have already been found.
We didn't model this, but it is a more efficient approach than just keeping a
large pool of potential synapses.
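A minimal sketch of that initialization, assuming hypothetical numbers for the
pool radius, Gaussian width, and jitter (none of which are specified above):

```python
import numpy as np

def init_potential_synapses(center, input_shape, radius=8.0, sigma=4.0,
                            rng=np.random.default_rng(0)):
    """Assign a circular pool of potential synapses around `center` on a 2D
    input map.  Initial permanences fall off with a Gaussian of distance, so
    inputs closest to the center start with the highest values."""
    ys, xs = np.mgrid[0:input_shape[0], 0:input_shape[1]]
    dist = np.hypot(ys - center[0], xs - center[1])
    in_pool = dist <= radius                            # circular potential pool
    perm = np.exp(-dist**2 / (2 * sigma**2))            # Gaussian falloff
    perm += rng.uniform(-0.05, 0.05, size=perm.shape)   # small random jitter
    perm = np.clip(perm, 0.0, 1.0)
    perm[~in_pool] = 0.0                                # no synapse outside pool
    return in_pool, perm

pool, perms = init_potential_synapses(center=(16, 16), input_shape=(32, 32))
```

One detector per call; a full SP would repeat this for every column, with
centers laid out over the input space.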

Jeff

 

From: nupic [mailto:[email protected]] On Behalf Of Francisco
Webber
Sent: Tuesday, October 22, 2013 3:19 PM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] Looking for help in understanding part of the HTM
white paper

 

Jeff,

the SP with dynamic locality sounds exactly like the kind of configuration I
was hoping to feed our word-SDRs into. A position within a 2D word-SDR
(described by two coordinates) corresponds to a specific semantic
environment. Let's say the upper left area stands for nature, animals,
plants, etc., and the lower right quadrant stands for sports and TV events.
This topology of the semantic space is captured within our CEPT Retina.

If we now feed text about lions through the CEPT Retina into such a dynamic
SP, and if the column inputs are organized in the same 2D way as the retina,
the columns corresponding to the upper left area will become subdivided into
smaller and smaller input fields while "reading" through the material. The
same would be true in the lower right quadrant while reading the football news.

The CLA layer will end up reflecting, with its input densities, the
"semantic topology" that the retina has captured. A topological CLA layer
can easily be extended to become a "natural" or implicit hierarchy, if one
just symmetrically reduces the number of available columns in each higher
layer while maintaining the given projection topology.

 

Francisco

 

 

 

On 22.10.2013, at 22:22, "Jeff Hawkins" <[email protected]> wrote:





We don't have any definitive numbers on this.  In general the SP is tolerant
to a large range of sparsity in the input, but the actual numbers depend on
several things.  On the sparser end, it is important that there are enough
active input bits for the SP to recognize patterns.  The more input bits you
have, the sparser the patterns can be and still have a sufficient number of
active input bits.  On the denser side of the scale, I would expect the SP to
start breaking down by 50% active input bits, maybe earlier.

 

We have a method of determining if a trained SP is working well.  Recall
that the individual bits of the spatial pooler are trying to learn common
spatial patterns in the input.  We often refer to them as coincidence
detectors.  After training the SP, you can look at how many valid synapses
each SP coincidence detector has.  The number of valid synapses tells you
how rich a spatial pattern this coincidence detector has learned.  For
example, 5 or fewer valid synapses is not much of a spatial pattern, and the
SP output will not be very stable.  This indicates a system that has too
few SP coincidence detectors or a system where the input doesn't contain
many repeating patterns.  For example, if you feed random patterns into the
SP, it won't find anything to learn and the coincidence detectors will have
few synapses.  The more valid synapses you find in the trained SP
coincidence detectors, the better the job the SP is doing.  If there are not
enough total active input bits, the SP won't find rich patterns.  If the
input is too dense, then the input patterns will likely overlap a lot and the
SP will have trouble separating them.
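That diagnostic is easy to compute from the learned permanence values.  Here
is a sketch with a toy permanence matrix and an assumed connected-permanence
threshold of 0.2 (the threshold value is my choice, not from the text above):

```python
import numpy as np

def valid_synapse_counts(permanences, connected_threshold=0.2):
    """For each coincidence detector (one row per detector), count the
    synapses whose permanence is at or above the connected threshold.
    Low counts (e.g. 5 or fewer) suggest the detector learned little."""
    return (permanences >= connected_threshold).sum(axis=1)

# Toy example: 3 detectors x 8 potential synapses
perms = np.array([
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.1, 0.0, 0.0],   # rich pattern: 5 valid
    [0.3, 0.25, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0],  # weak pattern: 2 valid
    [0.1, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # nothing learned: 0 valid
])
counts = valid_synapse_counts(perms)
```

A histogram of these counts over all detectors gives a quick picture of how
well a trained SP is doing.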

 

Again, in practice we found the SP to be tolerant; I am just talking about
the extremes.

 

Advanced topic:

Imagine the million bits coming from the retina, and assume they have some
reasonable sparse activity, say 5%.  If we feed this into a plain-vanilla
spatial pooler, it won't work well at all.  Even if we have 1M columns in the
SP, the number of patterns coming from the retina is so huge that each
column in the SP will be overwhelmed.  There are WAY more than 1M patterns
coming from the retina; it will look like noise.  However, we can fix this
problem by using topology.  When we implement the SP with topology, the
individual coincidence detectors will limit the area of the input they look
at until they find rich spatial patterns.  The size of the area they look at
varies.  If the input patterns become less varied, the input area of a
coincidence detector will expand.  If the input patterns become more varied,
the input area will contract.  This is the basis of plasticity.  We tested
this and it worked beautifully.
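The text describes the behavior but not the mechanism, so here is one
hypothetical update rule that produces it: track how varied the patterns
inside a detector's input area are, and grow or shrink the area toward a
target level of variety (all parameter values here are made up):

```python
def adapt_input_area(radius, observed_variety, target_variety=50,
                     grow=1.05, shrink=0.95,
                     min_radius=2.0, max_radius=32.0):
    """Hypothetical rule: if the patterns inside a detector's input area
    are less varied than the target, expand the area; if they are more
    varied (noise-like), contract it.  Each detector settles on a region
    whose structure it can actually learn."""
    if observed_variety < target_variety:
        radius *= grow      # too little variety: look at a wider area
    elif observed_variety > target_variety:
        radius *= shrink    # too much variety: specialize on a smaller area
    return min(max(radius, min_radius), max_radius)
```

Iterating this rule across all detectors gives the "competing and jostling"
dynamic described earlier: detectors over a noise region shrink or drift
toward adjacent structured regions.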

Jeff

 

From: nupic [mailto:[email protected]] On Behalf Of Pedro
Tabacof
Sent: Tuesday, October 22, 2013 4:30 AM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] Looking for help in understanding part of the HTM
white paper

 

Hello,

 

Is there a recommended level for input sparsity? What is the minimum and
maximum sparsity it can work functionally with?

 

Thanks,
Pedro.

 

On Mon, Oct 21, 2013 at 6:28 PM, Jeff Hawkins <[email protected]> wrote:

Perhaps this wasn’t written as well as it should have been.

 

The spatial pooler converts one sparse representation into another sparse
representation.  The output of a spatial pooler has a fixed number of bits
(equal to the number of columns) and a relatively fixed sparsity, say 2%.
The spatial pooler works just fine with a range in the number of input bits
and a range in sparsity.  In some ways the goal of the SP is to handle any
amount of input and convert it to an output of fixed size and sparsity.  The
other thing it does is learn the common spatial patterns in the input and
make sure to represent those well.
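The fixed-size, fixed-sparsity conversion can be sketched as a
k-winners-take-all over column overlap scores.  This is a deliberate
simplification of the real SP (no boosting, no learning), with scaled-down
sizes standing in for the white paper's 20,000-in / 10,000-out example:

```python
import numpy as np

def sp_output(input_bits, proj, sparsity=0.02):
    """Toy spatial-pooling step: score each column by its overlap with the
    active input bits, then activate only the top `sparsity` fraction of
    columns.  Output size and sparsity are fixed whatever the input density."""
    overlaps = proj @ input_bits                 # one overlap score per column
    n_active = max(1, int(len(overlaps) * sparsity))
    winners = np.argsort(overlaps)[-n_active:]   # k-winners-take-all
    out = np.zeros(len(overlaps), dtype=np.int8)
    out[winners] = 1
    return out

rng = np.random.default_rng(0)
n_in, n_cols = 2_000, 1_000
proj = (rng.random((n_cols, n_in)) < 0.05).astype(np.int32)  # random wiring
dense = (rng.random(n_in) < 0.45).astype(np.int32)   # ~45% of input bits on
sparse = (rng.random(n_in) < 0.10).astype(np.int32)  # ~10% of input bits on
# Both inputs yield the same number of active columns: 2% of 1,000 = 20
```

Whether ~45% or ~10% of the input bits are active, exactly 20 of the 1,000
output bits come on, which is the conversion the white paper describes.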

 

The output sparsity of the SP needs to be relatively fixed for the temporal
pooler (sequence memory) to work.  The number of output bits, equal to the
number of columns, also has to be fixed for the TP to work.

 

Why is it important that the input can vary?  In a real brain, the hierarchy
of the neocortex is complicated and messy.  Multiple regions converge onto
destination regions as you ascend the hierarchy.  By allowing the number of
input bits to vary over a wide range, evolution could wire up the hierarchy
in lots of different ways and the cortex would continue to work OK.  If we
took an existing brain and then added a connection between two regions that
previously were not connected, the SP in the destination region wouldn't
break.  For example, in normal humans the size of primary visual cortex
varies by a factor of 3, but the size of the output of the retina is always
about 1M fibers.  The SP in V1 can handle a broad range in the ratio of the
number of input bits to the number of output bits.

 

The sparsity level of the input can vary for multiple reasons: lack of
sensory input, changes in attention (which effectively turn off input bits),
and temporal pooling itself.  So it is important that the spatial pooler
takes whatever it is given and converts it into a relatively fixed output.

 

This is why the SP does what it does and why it is important.  Do you need
help understanding how the SP does this?

Jeff

 

From: nupic [mailto:[email protected]] On Behalf Of Jeff Fohl
Sent: Sunday, October 20, 2013 6:41 PM
To: [email protected]
Subject: [nupic-dev] Looking for help in understanding part of the HTM white
paper

 

Hello -

 

I hope this is not being posted to the wrong list. This is my first post
here. Please let me know if there is a more appropriate place for this
question.

 

In preparation for learning NuPIC, I have read "On Intelligence", and I am
now reading the HTM white paper put out by Numenta.

 

Making my way through the white paper, I got stuck on one passage, which I
can't really make sense of. Wondering if anyone can help me through this
part. The passage in question is on pages 11-12 of the white paper PDF -
specifically the second paragraph included below.

 

HTM regions also use sparse distributed representations. In fact, the memory
mechanisms within an HTM region are dependent on using sparse distributed
representations, and wouldn’t work otherwise. The input to an HTM region is
always a distributed representation, but it may not be sparse, so the first
thing an HTM region does is to convert its input into a sparse distributed
representation.

For example, a region might receive 20,000 input bits. The percentage of
input bits that are “1” and “0” might vary significantly over time. One time
there might be 5,000 “1” bits and another time there might be 9,000 “1”
bits. The HTM region could convert this input into an internal
representation of 10,000 bits of which 2%, or 200, are active at once,
regardless of how many of the input bits are “1”. As the input to the HTM
region varies over time, the internal representation also will change, but
there always will be about 200 bits out of 10,000 active. 

So, what exactly is going on here? How does a fluctuating input flow of
20,000 bits get converted into 200 bits? Obviously there is something
important going on here, but I don't understand what it is. Any help
illuminating this would be greatly appreciated!

Many thanks,

Jeff


_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org





 

-- 
Pedro Tabacof,
Unicamp - Eng. de Computação 08.


 

