Francisco, CEPT -> CLA is a very odd transition. I'm pretty sure the SP won't get you anything useful.
CEPT Isn't a "Retina", It's the World

Unlike CEPT, the actual retina is not an organized map. It has many copies of the same small number of feature detectors, each distributed (more or less) evenly across its surface. The world, on the other hand, has a coordinate system: height, width, depth, etc. Closeness in the real world is defined by these dimensions. CEPT provides another definition of "closeness", in the context of words. It is the world of words.

An object in the CEPT world, though, is a very strange thing indeed. It never moves and never changes, and because you don't recompute the CEPT world again and again, you never get a different "view" of the world. It's as if the only thing you could see in your entire world was an apple, and you never saw it from any other angle, and it never moved or changed.

If the CEPT map is the world, and each word is an object in that world, what would a retina be?

The Retina

The retina is a set of predefined feature detectors evenly distributed across its surface: red, green, blue, and light. The ganglia then add an initial processing step to give you a second set of evenly distributed feature detectors for light/dark transitions and all the rest. The critical assumption here is that there is a set of common features that could occur anywhere in the 2D projection of the world. The reason, of course, is that our view changes over time and objects move in the real world. The retina evolved as a moving observation platform for a dynamic world.

The challenge for the retina and the cortical hierarchy is then to build *invariant* representations of the world. Because the world is noisy and dynamic, you need all this circuitry to tease out the repeating patterns and common causes.

But the CEPT world isn't like that. It's totally invariant to begin with. No single object ever moves or is viewed from a different direction.
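To make the "totally invariant" point concrete, here is a minimal sketch. The encoding function below is hypothetical (a hash, not CEPT's actual retina, so unlike real CEPT SDRs it does not place semantically close words near each other); it only illustrates the property under discussion: each word maps deterministically to one fixed set of bits, and "closeness" between two words reduces to overlap between two static bit sets.

```python
import hashlib

def word_sdr(word, size=16384, active=40):
    """Hypothetical stand-in for a word SDR: deterministically derive
    a fixed set of `active` bit indices out of `size` for a word."""
    bits, counter = set(), 0
    while len(bits) < active:
        h = hashlib.sha256(f"{word}:{counter}".encode()).digest()
        bits.add(int.from_bytes(h[:4], "big") % size)
        counter += 1
    return frozenset(bits)

def overlap(a, b):
    """Closeness in the word world is just bit overlap."""
    return len(a & b)

# The same word always yields the identical SDR -- the object never
# moves and is never seen from a different angle.
assert word_sdr("lion") == word_sdr("lion")
```

Because the mapping is a pure function of the word, recomputing it never produces a new "view" — which is exactly why there is nothing dynamic for an SP to tease apart.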
A given word will always map directly to the same set of bits.

What *Could* the SP Pick Up?

What you're really hoping for is not that the SP/CLA will give you smaller and smaller granularity, but that it will discover features that are common across words. Another way to say this: you hope that your self-organizing map has accidentally captured dimensions other than those used to calculate the centroids and distance metrics. If you already have metrics along those other axes, though, it doesn't make a lot of sense to use the SP to try to discover them. You can calculate them directly.

Ultimately you want to know how a raven is like a writing desk. But to know that "Poe wrote on both," you have to be able to perceive the world from a very odd angle.

The TP

The TP, on the other hand, is a straightforward sequence learner that takes CEPT-world-like representations as inputs. You should easily get sentence generation, hopefully with a fluent-aphasia-like character. The sentences should be nonsense but not garbage. Of course, you could also do this with n-grams.

Ian

On Tue, Oct 22, 2013 at 3:19 PM, Francisco Webber <[email protected]> wrote:
> Jeff,
> the SP with dynamic locality sounds exactly like the kind of configuration
> I was hoping to feed our word SDRs into. A position within a 2D word SDR
> (described by two coordinates) corresponds to a specific semantic
> environment. Let's say the upper left area stands for nature, animals,
> plants, etc., and the lower right quadrant stands for sports and TV events.
> This topology of the semantic space is captured within our CEPT Retina.
> If we now feed text about lions through the CEPT retina into such a
> dynamic SP, and if the column inputs are organized in the same 2D way as
> the retina, the columns corresponding to the upper left area will become
> subdivided into smaller and smaller input fields while "reading" through
> the material. The same would be true in the lower right quadrant while
> reading the football news.
> The CLA layer will end up reflecting, with its input densities, the
> "semantic topology" that the retina has captured. A topological CLA layer
> can easily be extended to become a "natural" or implicit hierarchy, if one
> just symmetrically reduces the number of available columns in each higher
> layer while maintaining the given projection topology.
>
> Francisco
>
> On 22.10.2013, at 22:22, "Jeff Hawkins" <[email protected]> wrote:
>
> We don't have any definitive numbers on this. In general the SP is
> tolerant to a large range of sparsity in the input, but the actual numbers
> depend on several things. On the sparser end it is important that there
> are enough active input bits for the SP to recognize patterns. The more
> input bits you have, the sparser the patterns can be and still have a
> sufficient number of active input bits. On the denser side of the scale I
> would expect the SP to start breaking down by 50% active input bits, maybe
> earlier.
>
> We have a method of determining if a trained SP is working well. Recall
> that the individual bits of the spatial pooler are trying to learn common
> spatial patterns in the input. We often refer to them as coincidence
> detectors. After training the SP you can look at how many valid synapses
> each SP coincidence detector has. The number of valid synapses tells you
> how rich a spatial pattern that coincidence detector has learned. For
> example, 5 or fewer valid synapses is not much of a spatial pattern, and
> the SP output will not be very stable. This represents a system that has
> too few SP coincidence detectors, or a system where the input doesn't
> contain many repeating patterns. For example, if you feed random patterns
> into the SP it won't find anything to learn and the coincidence detectors
> will have few synapses. The more valid synapses you find in the trained SP
> coincidence detectors, the better the job the SP is doing.
> If there are not
> enough total active input bits, the SP won't find rich patterns. If the
> input is too dense, then the input patterns will likely overlap a lot and
> the SP will have trouble separating them.
>
> Again, in practice we found the SP to be tolerant; I am just talking about
> the extremes.
>
> Advanced topic:
> Imagine the million bits coming from the retina, and assume they have some
> reasonable sparse activity, say 5%. If we feed this into a plain vanilla
> spatial pooler it won't work well at all. Even if we have 1M columns in
> the SP, the number of patterns coming from the retina is so huge that each
> column in the SP will be overwhelmed. There are WAY more than 1M patterns
> coming from the retina; it will look like noise. However, we can fix this
> problem by using topology. When we implement the SP with topology, the
> individual coincidence detectors will limit the area of the input they
> look at until they find rich spatial patterns. The size of the area they
> look at varies. If the input patterns become less varied, the input area
> of a coincidence detector will expand. If the input patterns become more
> varied, the input area will contract. This is the basis of plasticity. We
> tested this and it worked beautifully.
> Jeff
>
> From: nupic [mailto:[email protected]] On Behalf Of Pedro Tabacof
> Sent: Tuesday, October 22, 2013 4:30 AM
> To: NuPIC general mailing list.
> Subject: Re: [nupic-dev] Looking for help in understanding part of the
> HTM white paper
>
> Hello,
>
> Is there a recommended level for input sparsity? What are the minimum and
> maximum sparsity levels it can work with functionally?
>
> Thanks,
> Pedro.
>
> On Mon, Oct 21, 2013 at 6:28 PM, Jeff Hawkins <[email protected]>
> wrote:
> Perhaps this wasn't written as well as it should have been.
>
> The spatial pooler converts one sparse representation into another sparse
> representation.
> The output of a spatial pooler has a fixed number of bits
> (equal to the number of columns) and has a relatively fixed sparsity, say
> 2%. The spatial pooler works just fine with a range in the number of input
> bits and a range in sparsity. In some ways the goal of the SP is to handle
> any amount of input and convert it to a fixed-size, fixed-sparsity output.
> The other thing it does is learn the common spatial patterns in the input
> and make sure to represent those well.
>
> The output sparsity of the SP needs to be relatively fixed for the
> temporal pooler (sequence memory) to work. The number of output bits,
> equal to the number of columns, also has to be fixed for the TP to work.
>
> Why is it important that the input can vary? In a real brain the
> hierarchy of the neocortex is complicated and messy. Multiple regions
> converge onto destination regions as you ascend the hierarchy. By allowing
> the number of input bits to vary over a wide range, evolution could wire
> up the hierarchy in lots of different ways and the cortex would continue
> to work OK. If we took an existing brain and added a connection between
> two regions that previously were not connected, the SP in the destination
> region wouldn't break. For example, in normal humans the size of primary
> visual cortex varies by a factor of 3, but the size of the output of the
> retina is always about 1M fibers. The SP in V1 can handle a broad range in
> the ratio of the number of input bits to the number of output bits.
>
> The sparsity level of the input can vary for multiple reasons: lack of
> sensory input, changes in attention (which effectively turn off input
> bits), and temporal pooling itself. So it is important that the spatial
> pooler take whatever it is given and convert it into a relatively fixed
> output.
>
> This is why the SP does what it does and why it is important.
> Do you need
> help understanding how the SP does this?
> Jeff
>
> From: nupic [mailto:[email protected]] On Behalf Of Jeff Fohl
> Sent: Sunday, October 20, 2013 6:41 PM
> To: [email protected]
> Subject: [nupic-dev] Looking for help in understanding part of the HTM
> white paper
>
> Hello -
>
> I hope this is not being posted to the wrong list. This is my first post
> here. Please let me know if there is a more appropriate place for this
> question.
>
> In preparation for learning NuPIC, I have read "On Intelligence", and I
> am now reading the HTM white paper put out by Numenta.
>
> Making my way through the white paper, I got stuck on one passage which I
> can't really make sense of, and I'm wondering if anyone can help me
> through it. The passage in question is on pages 11-12 of the white paper
> PDF - specifically the second paragraph included below.
>
> *HTM regions also use sparse distributed representations. In fact, the
> memory mechanisms within an HTM region are dependent on using sparse
> distributed representations, and wouldn't work otherwise. The input to an
> HTM region is always a distributed representation, but it may not be
> sparse, so the first thing an HTM region does is to convert its input into
> a sparse distributed representation.*
> *For example, a region might receive 20,000 input bits. The percentage of
> input bits that are "1" and "0" might vary significantly over time. One
> time there might be 5,000 "1" bits and another time there might be 9,000
> "1" bits. The HTM region could convert this input into an internal
> representation of 10,000 bits of which 2%, or 200, are active at once,
> regardless of how many of the input bits are "1". As the input to the HTM
> region varies over time, the internal representation also will change, but
> there always will be about 200 bits out of 10,000 active.*
>
> So, what exactly is going on here?
> How does a fluctuating input flow of
> 20,000 bits get converted into 200 bits? Obviously there is something
> important going on here, but I don't understand what it is. Any help
> illuminating this would be greatly appreciated!
> Many thanks,
> Jeff
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
> --
> Pedro Tabacof,
> Unicamp - Eng. de Computação 08.
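[Editor's note: the fixed-size, fixed-sparsity conversion described in the quoted white-paper passage can be sketched very roughly as an overlap-plus-k-winners computation. This is a toy illustration, not NuPIC's actual SpatialPooler (which adds learning, boosting, and inhibition topology); all numbers are taken from the passage above (20,000 input bits, 10,000 columns, 2% = 200 active).]

```python
import random

def toy_spatial_pooler(input_bits, n_input=20000, n_columns=10000,
                       potential=128, k=200, seed=42):
    """Toy k-winners-take-all pooler: whatever the input density,
    the output is exactly k active columns out of n_columns.

    input_bits: set of active input indices (density may vary freely).
    """
    rng = random.Random(seed)
    # Each column watches a fixed random subset of the input space.
    fields = [rng.sample(range(n_input), potential) for _ in range(n_columns)]
    # A column's overlap is how many of its watched bits are active.
    overlaps = [sum(1 for i in field if i in input_bits) for field in fields]
    # Global inhibition: keep only the k columns with the highest overlap.
    ranked = sorted(range(n_columns), key=lambda c: overlaps[c], reverse=True)
    return set(ranked[:k])

# Whether 5,000 or 9,000 of the 20,000 input bits are "1", the output
# is always exactly 200 of 10,000 columns (2% sparsity).
sparse_in = set(random.Random(1).sample(range(20000), 5000))
dense_in = set(random.Random(2).sample(range(20000), 9000))
assert len(toy_spatial_pooler(sparse_in)) == 200
assert len(toy_spatial_pooler(dense_in)) == 200
```

Because the winners are chosen by rank rather than by threshold, the output sparsity is fixed by construction, which is the property the temporal pooler depends on.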
