We don’t have any definitive numbers on this.  In general the SP is tolerant
of a large range of sparsity in the input, but the actual numbers depend on
several things.  On the sparser end it is important that there are enough
active input bits for the SP to recognize patterns.  The more input bits you
have, the sparser the patterns can be while still leaving a sufficient number
of active input bits.  On the denser end of the scale I would expect the SP
to start breaking down by around 50% active input bits, maybe earlier.
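
To make that concrete, here is a trivial back-of-the-envelope check (the
sizes and the 2% sparsity are mine, for illustration only):

    # Active input bits = input size * sparsity.  Illustrative numbers only.
    for n_inputs, sparsity in [(1000, 0.02), (10000, 0.02), (100000, 0.02)]:
        n_active = int(n_inputs * sparsity)
        print("%6d input bits at %.0f%% sparsity -> %4d active bits" %
              (n_inputs, sparsity * 100, n_active))

The same 2% sparsity leaves only 20 active bits in a 1,000-bit input but
2,000 in a 100,000-bit input, which is why larger inputs can tolerate
sparser patterns.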

 

We have a method of determining whether a trained SP is working well.  Recall
that the individual bits of the spatial pooler are trying to learn common
spatial patterns in the input.  We often refer to them as coincidence
detectors.  After training the SP you can look at how many valid synapses
each SP coincidence detector has.  The number of valid synapses tells you
how rich a spatial pattern this coincidence detector has learned.  For
example, 5 or fewer valid synapses is not much of a spatial pattern, and the
SP output will not be very stable.  This indicates a system that has too
few SP coincidence detectors or a system where the input doesn’t contain
many repeating patterns.  For example, if you feed random patterns into the
SP it won’t find anything to learn, and the coincidence detectors will end
up with few synapses.  The more valid synapses you find in the trained SP
coincidence detectors, the better the job the SP is doing.  If there are not
enough total active input bits, the SP won’t find rich patterns.  If the
input is too dense, then the input patterns will likely overlap a lot and
the SP will have trouble separating them.
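
Here is a minimal sketch of that diagnostic, assuming you can get at the
SP's permanence matrix and its connected-permanence threshold (the variable
names are made up, and the random matrix is just a stand-in for a trained SP):

    import numpy as np

    # Stand-in for a trained SP: one row of synapse permanences per column.
    permanences = np.random.rand(2048, 1024)  # (num_columns, num_input_bits)
    connected_perm = 0.5  # a synapse is "valid" (connected) above this

    # Count valid synapses per coincidence detector (per column).
    valid_counts = (permanences >= connected_perm).sum(axis=1)

    # Columns with 5 or fewer valid synapses haven't learned much of a pattern.
    print("weak columns:", (valid_counts <= 5).sum())
    print("mean valid synapses per column:", valid_counts.mean())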

 

Again, in practice we have found the SP to be tolerant; I am just talking
about the extremes.

 

Advanced topic:

Imagine the million bits coming from the retina, and assume they have some
reasonably sparse activity, say 5%.  If we feed this into a plain vanilla
spatial pooler it won’t work well at all.  Even if we have 1M columns in the
SP, the number of patterns coming from the retina is so huge that each
column in the SP will be overwhelmed.  There are WAY more than 1M patterns
coming from the retina; it will look like noise.  However, we can fix this
problem by using topology.  When we implement the SP with topology, the
individual coincidence detectors limit the area of the input they look at
until they find rich spatial patterns.  The size of the area they look at
varies.  If the input patterns become less varied, the input area of a
coincidence detector will expand.  If the input patterns become more varied,
the input area will contract.  This is the basis of plasticity.  We tested
this and it worked beautifully.
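
For illustration, here is a toy 1-D sketch of the topology idea (my own
simplification: it gives each coincidence detector a fixed local receptive
field, whereas in the scheme above the radius would expand and contract with
learning):

    import numpy as np

    num_inputs, num_columns, radius = 1000, 100, 16

    # Spread the columns evenly over the input space and restrict each one's
    # potential synapses to input bits within `radius` of its center.
    centers = np.linspace(0, num_inputs - 1, num_columns).astype(int)
    pools = [np.arange(max(0, c - radius), min(num_inputs, c + radius + 1))
             for c in centers]
    print("column 0 sees input bits %d..%d" % (pools[0][0], pools[0][-1]))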

Jeff

 

From: nupic [mailto:[email protected]] On Behalf Of Pedro
Tabacof
Sent: Tuesday, October 22, 2013 4:30 AM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] Looking for help in understanding part of the HTM
white paper

 

Hello,

 

Is there a recommended level for input sparsity? What are the minimum and
maximum sparsity levels it can function with?

 

Thanks,
Pedro.

 

On Mon, Oct 21, 2013 at 6:28 PM, Jeff Hawkins <[email protected]> wrote:

Perhaps this wasn’t written as well as it should have been.

 

The spatial pooler converts one sparse representation into another sparse
representation.  The output of a spatial pooler has a fixed number of bits
(equal to the number of columns) and a relatively fixed sparsity, say 2%.
The spatial pooler works just fine with a range in the number of input bits
and a range in sparsity.  In some ways the goal of the SP is to handle any
amount of input and convert it into an output of fixed size and sparsity.
The other thing it does is learn the common spatial patterns in the input
and make sure to represent those well.
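
As a toy illustration of that conversion (not the real SP algorithm, just
random connections plus a k-winners-take-all step, with sizes scaled down
from the white paper's 20,000/10,000/200 example to keep it fast):

    import numpy as np

    num_inputs, num_columns = 2000, 1000
    k = int(0.02 * num_columns)  # fixed output sparsity: 20 active columns

    rng = np.random.RandomState(42)
    # Each column is connected to a random 5% of the input, a crude stand-in
    # for the SP's learned synapses.
    connections = rng.rand(num_columns, num_inputs) < 0.05

    def pool(input_bits):
        overlaps = connections.dot(input_bits)    # active inputs seen per column
        active = np.zeros(num_columns, dtype=bool)
        active[np.argsort(overlaps)[-k:]] = True  # exactly k winners, always
        return active

    for n_on in (500, 900):  # input density varies; output sparsity does not
        x = np.zeros(num_inputs)
        x[rng.choice(num_inputs, n_on, replace=False)] = 1
        print("%d input bits on -> %d active columns" % (n_on, int(pool(x).sum())))

Whether 500 or 900 input bits are on, the output always has exactly 20
active columns; the real SP additionally learns which connections matter,
which this sketch does not.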

 

The output sparsity of the SP needs to be relatively fixed for the temporal
pooler (sequence memory) to work.  The number of output bits, equal to the
number of columns, also has to be fixed.

 

Why is it important that the input can vary?  In a real brain the hierarchy
of the neocortex is complicated and messy.  Multiple regions converge onto
destination regions as you ascend the hierarchy.  By allowing the number of
input bits to vary over a wide range, evolution could wire up the hierarchy
in lots of different ways and the cortex would continue to work OK.  If we
took an existing brain and added a connection between two regions that
previously were not connected, the SP in the destination region wouldn’t
break.  For example, in normal humans the size of primary visual cortex
varies by a factor of 3, but the size of the output of the retina is always
about 1M fibers.  The SP in V1 can handle a broad range in the ratio of the
number of input bits to the number of output bits.

 

The sparsity level of the input can vary for multiple reasons: lack of
sensory input, changes in attention (which effectively turn off input bits),
and temporal pooling itself.  So it is important that the spatial pooler
take whatever it is given and convert it into a relatively fixed output.

 

This is why the SP does what it does and why it is important.  Do you need
help understanding how the SP does this?

Jeff

 

From: nupic [mailto:[email protected]] On Behalf Of Jeff Fohl
Sent: Sunday, October 20, 2013 6:41 PM
To: [email protected]
Subject: [nupic-dev] Looking for help in understanding part of the HTM white
paper

 

Hello -

 

I hope this is not being posted to the wrong list. This is my first post
here. Please let me know if there is a more appropriate place for this
question.

 

In preparation for learning NuPIC, I have read "On Intelligence", and I am
now reading the HTM white paper put out by Numenta.

 

Making my way through the white paper, I got stuck on one passage, which I
can't really make sense of. Wondering if anyone can help me through this
part. The passage in question is on pages 11-12 of the white paper PDF -
specifically the second paragraph included below.

 

HTM regions also use sparse distributed representations. In fact, the memory
mechanisms within an HTM region are dependent on using sparse distributed
representations, and wouldn’t work otherwise. The input to an HTM region is
always a distributed representation, but it may not be sparse, so the first
thing an HTM region does is to convert its input into a sparse distributed
representation.

For example, a region might receive 20,000 input bits. The percentage of
input bits that are “1” and “0” might vary significantly over time. One time
there might be 5,000 “1” bits and another time there might be 9,000 “1”
bits. The HTM region could convert this input into an internal
representation of 10,000 bits of which 2%, or 200, are active at once,
regardless of how many of the input bits are “1”. As the input to the HTM
region varies over time, the internal representation also will change, but
there always will be about 200 bits out of 10,000 active. 

So, what exactly is going on here? How does a fluctuating input flow of
20,000 bits get converted into 200 bits? Obviously there is something
important going on here, but I don't understand what it is. Any help
illuminating this would be greatly appreciated!

Many thanks,

Jeff


 

-- 
Pedro Tabacof,
Unicamp - Eng. de Computação 08.

