Jeff san,

Thank you very much for sharing the information.  Very insightful.

This open-source community is a unique place for me to gain this kind of
knowledge.


2013/8/30 Jeff Hawkins <[email protected]>

> “Could you expand a little on what biological problem you're referring
> to here?
>
> -Mike”
>
> OK, but I suspect it is beyond most people’s interest level, and I
> don’t want to confuse anyone.  But for those who are interested…
>
> The neurons in the CLA can be in a “predictive state”.  Biologically
> this is a cell that is depolarized.
>
> The neurons in the CLA can be in an “active state”.  Biologically this
> is equivalent to firing or generating one or more spikes.
>
> These two states are sufficient for learning sequences, but not for
> temporal pooling.
>
> The addition of temporal pooling requires a third state, which I don’t
> like because it is a little tricky to make it work with real neurons.
>
> When we first implemented the CLA we started with sequence memory and
> everything worked fine.  After a bunch of testing we added temporal
> pooling.  With temporal pooling the cells learn to predict their feed
> forward activation earlier and earlier.  It works like this.  First a
> cell becomes active due to a feed forward input.  It then forms
> synapses that allow it to predict its activity one step in advance.
> Later it becomes active one step in advance and then forms synapses
> that allow it to predict its activity two steps in advance, and so on.
> (The system doesn’t require discrete steps but it is easier to think
> about it that way.)  Over repeated training, a cell learns to be active
> over longer and longer sequences of patterns.  This is cool for a
> number of reasons.  A cell will learn to be active for as much time as
> it can correctly predict its future activity.  If the world consists of
> a few long repeatable sequences then cells will be active over long
> periods of time.  The data determines how much pooling a cell can do.
> The more pooling that can be done at one level of the hierarchy, the
> easier the job of the next level.  It also suggests why we can learn
> new tasks very quickly (i.e. learn a new sequence), but why mastering
> something, making it second nature, requires many repetitions.  I
> mentioned this in On Intelligence when I said that with practice,
> knowledge gets represented lower and lower in the hierarchy.  As a
> region gets better at temporal pooling, it frees the memory in the next
> region for more advanced inference.
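The "earlier and earlier" dynamic above can be sketched in a few lines of Python. This is a toy illustration, not NuPIC code; the variable names are made up:

```python
# Toy sketch of temporal pooling: a cell that fires on pattern "D"
# extends its prediction of a fixed repeating sequence one step
# further back on each pass through the data.

sequence = ["A", "B", "C", "D"]  # repeating input patterns
target = "D"                     # the cell fires on pattern "D"

# Patterns the cell has learned to treat as predictive of "D".
predictive_contexts = set()

for _ in range(len(sequence)):       # repeated training passes
    earliest = len(sequence) - 1     # index of "D"
    for i, p in enumerate(sequence[:-1]):
        if p in predictive_contexts:
            earliest = min(earliest, i)
    # Learn one step earlier than the earliest point at which the
    # cell is already predicted or active.
    if earliest > 0:
        predictive_contexts.add(sequence[earliest - 1])

# After enough passes the cell is predicted (hence can stay active)
# across the whole sequence leading up to "D".
print(sorted(predictive_contexts))  # → ['A', 'B', 'C']
```

The data alone determines how far back the prediction reaches, which is the point made above: pooling extent is learned, not configured.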
>
> The problem is that cells that are pooling over time must be
> active/spiking, not just depolarized as in sequence learning.  When
> cells become active by pooling, in advance of feed forward activation,
> it messes up the sequence memory.  The CLA can’t tell the difference
> between activation due to a real-world feed forward input and
> activation due to pooling.  What happens is the CLA doesn’t wait for
> real input, and sequences run away forward in time.
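A toy sketch of that runaway behavior, assuming a trivially learned sequence A→B→C→D (hypothetical code, not the actual CLA):

```python
# Sketch of the runaway problem: if pooled (predicted-ahead) activation
# looks identical to feed-forward activation, the sequence memory
# advances on its own, ignoring real input.

transitions = {"A": "B", "B": "C", "C": "D"}  # learned sequence A→B→C→D

def step(active, real_input, distinguish_pooling):
    # Pooling activates the successor ahead of feed-forward input.
    pooled = transitions.get(active)
    if distinguish_pooling:
        # A pooled cell is marked as such, so the region still waits
        # for real input before advancing the sequence.
        return real_input
    # Without the distinction, pooled activation is treated as real
    # activation and drives the next transition itself.
    return pooled or real_input

# Real input stays stuck on "A"; watch what the region reports.
active = "A"
for _ in range(3):
    active = step(active, "A", distinguish_pooling=False)
print(active)  # → D: the sequence has run away despite constant input
```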
>
> For pooling to work, the CLA needs to distinguish between cell
> activation due to feed forward input and cell activation due to
> pooling.  We need two different states for an active cell.
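In implementation terms, this amounts to splitting the single "active" state in two. A hypothetical state set (illustrative names only, not NuPIC's actual API) might look like:

```python
from enum import Enum

# Hypothetical cell states for a CLA that separates the two kinds of
# activity discussed above.
class CellState(Enum):
    INACTIVE = 0
    PREDICTIVE = 1           # depolarized: expected to fire soon
    ACTIVE_FEEDFORWARD = 2   # firing because of real feed forward input
    ACTIVE_POOLING = 3       # firing ahead of input, via pooling
```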
>
> There is an elegant biological solution to this, but the evidence is
> equivocal.  The solution is: when a cell is activated due to
> feedforward input it generates a short burst of action potentials,
> three to five.  It does this once and then stops.  When a cell is
> activated by pooling it generates a series of spaced-out spikes.
> Believe it or not, there are quite a few papers suggesting this could
> be happening.  There is evidence of short bursts prior to a steady
> firing pattern.  The mini-bursts are in the literature, easy to find.
> I spoke to several scientists and they report seeing them.  Some claim
> they see them at the beginning of every trace.  However, others say
> they never see the mini-bursts.  The best evidence for mini-bursts is
> in layer 5 cells (yes, the motor ones that also project up the
> hierarchy).  These cells are called “intrinsically bursting” cells to
> reflect this behavior.  For temporal pooling to work I think we also
> need to see this mini-bursting behavior in layer 3.  Mini-bursts are
> seen in layer 3, but not by everybody; the evidence is much spottier.
> It is possible that all layer 3 cells exhibit this behavior and
> scientists are not reporting it.  Perhaps there are different classes
> of layer 3 cells and only some mini-burst.  I wish the evidence were
> more conclusive.
>
> For the mini-bursting hypothesis to be correct, a cell has to behave
> differently when receiving a mini-burst than when receiving regularly
> spaced spikes.  Here too the evidence is good.
>
> The synapses that form on distal dendrite branches (the sequence and
> pooling memory synapses) are far more effective when they get a burst
> of quick spikes in a row.  A thin dendrite amplifies the effect of
> multiple spikes because thin dendrites don’t leak current quickly and
> they have low capacitance.  Thus a burst of spikes on multiple synapses
> may be necessary for our dendrite-segment coincidence detector to work.
> A single spike won’t do it.  If a cell produces single spikes (not
> mini-bursts) when activated by a distal dendrite branch, then sequences
> won’t run away.  This is what we need; it solves our problem!
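The burst-versus-single-spike distinction can be illustrated with a simple leaky integrator. The parameters here are made up, purely to show the summation effect, not fitted to real dendrites:

```python
# Rough sketch of why a thin distal dendrite acts as a burst detector:
# with a slow leak, several closely spaced spikes sum above threshold,
# while isolated spikes decay away before the next one arrives.

def dendrite_response(spike_times, leak=0.7, dt=1.0, threshold=2.0):
    """Return True if the summed, leaky potential crosses threshold."""
    v = 0.0
    t_prev = None
    for t in spike_times:
        if t_prev is not None:
            # Exponential decay of the potential between spikes.
            v *= leak ** ((t - t_prev) / dt)
        v += 1.0  # unit EPSP per spike
        if v >= threshold:
            return True
        t_prev = t
    return False

print(dendrite_response([0, 1, 2]))    # mini-burst: True (spikes summate)
print(dendrite_response([0, 10, 20]))  # spaced spikes: False (each decays)
```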
>
> Conversely, axons that project up the hierarchy form synapses on
> proximal dendrites (the SP synapses).  Here, because the synapses are
> close to the big cell body and the dendrites have large diameters,
> there is large current leakage and high capacitance.  It has been shown
> that the first arriving spike on a proximal synapse has a large effect
> (depolarization), but subsequent spikes in a mini-burst have a much
> diminished effect.  This is good because we don’t want the spatial
> pooler in the higher region to be overly influenced by the mini-bursts.
> We want the SP to look at all active axons equally, those that are
> mini-bursting and those that are single-spiking via pooling.  This is
> another nice validation of the theory.
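A sketch of the same point from the SP's side, assuming spikes beyond the first on a proximal synapse contribute almost nothing, so each afferent axon's contribution is effectively clamped to one (hypothetical code):

```python
# From the spatial pooler's point of view, an afferent axon contributes
# the same whether it mini-bursts or fires single spikes, because only
# the first spike on a proximal synapse has a large effect.

def proximal_drive(spike_counts):
    # Clamp each axon's contribution to 1: spikes after the first in a
    # burst are assumed to add (nearly) nothing.
    return sum(min(n, 1) for n in spike_counts)

bursting = [4, 4, 4]  # three axons mini-bursting (feed-forward driven)
single   = [1, 1, 1]  # three axons single-spiking (pooling driven)
print(proximal_drive(bursting) == proximal_drive(single))  # → True
```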
>
> If you have followed all of this, you see that the mini-burst
> hypothesis solves the issues of pooling in a hierarchy, and it is
> supported by a lot of biological evidence.  It is a pretty cool
> explanation for why we see mini-bursts in layer 5 cells.  My only worry
> is that the evidence for mini-bursting in layer 3 cells is spotty.  If
> everyone said all layer 3 cells are intrinsically bursting, like
> forward-projecting layer 5 cells, I would be much happier.  All in all
> the theory holds together remarkably well and I don’t have another one,
> so I am sticking with it for now.
>
> Of course none of this matters for the SW implementation, but I have
> found over and over again that if you stray from the biology you will
> get lost.
>
> Jeff
>
> From: nupic [mailto:[email protected]] On Behalf Of Michael
> Ferrier
> Sent: Thursday, August 29, 2013 11:40 AM
> To: NuPIC general mailing list.
> Subject: Re: [nupic-dev] Inter-layer plumbing
>
> >> There is a biological problem with pooling the way we implemented
> that I never resolved.  So it is a work in progress.
>
> Hi Jeff,
>
> Could you expand a little on what biological problem you're referring
> to here?
>
> Thanks!
>
> -Mike
>
> _____________
> Michael Ferrier
> Department of Cognitive, Linguistic and Psychological Sciences, Brown
> University
> [email protected]
>
> On Thu, Aug 29, 2013 at 2:29 PM, Jeff Hawkins <[email protected]>
> wrote:
>
> Here are some thoughts about how to connect CLAs in a hierarchy.
>
> Here are some things we know about the brain.
>
> - Layer 3 in the cortex is the primary input layer.  (Sometimes input
> goes to layer 4 and layer 3, but layer 4 projects mostly to layer 3,
> and layer 4 doesn’t always exist.  So layer 3 is the primary input
> layer.  It exists everywhere.  We will ignore layer 4 for now.)
>
> - I believe the CLA represents a good model of what is happening in
> layer 3.
>
> - The output (i.e. axons) of layer 3 cells projects up the hierarchy,
> connecting to the proximal dendrites (SP) of the next region’s layer 3.
>
> - This isn’t the complete picture.  The axons of cells in layer 5 (the
> ones that project to motor areas) split in two, and one branch also
> projects up the hierarchy to layer 3 in the next region.  If we aren’t
> trying to incorporate motor behavior, then we can ignore layer 5 and
> say input goes from layer 3 to layer 3 to layer 3, etc.  Or CLA to CLA
> to CLA, etc.
>
> Each cell in layer 3 projects to the next region, so the input to a
> region is the output of all the cells in the previous region’s layer
> 3.  If we consider our default CLA size, there would be 64K input bits
> to the next level in the hierarchy.  Because of the distributed nature
> of knowledge it isn’t necessary that all cells in layer 3 project to
> the next region; as long as a good portion do, we should be OK.  But
> assume they all do.
>
> 64K is a lot of input bits, but the SP in the receiving region can take
> any number of bits and map them onto any number of columns.  That is
> one of the nice features of the SP: it can map an input of any
> dimension and sparsity to any number of columns.
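A minimal sketch of that mapping, using plain overlap scoring and top-k inhibition. This is illustrative only; the real SP adds boosting, learning, and local inhibition, and the sizes here (2048 columns, 40 active) are just example values:

```python
import random

# Minimal spatial-pooler sketch: map an input of arbitrary width and
# sparsity onto a fixed number of columns via overlap and top-k
# inhibition.

random.seed(0)
N_INPUT, N_COLUMNS, ACTIVE_COLUMNS = 65536, 2048, 40

# Each column samples a random subset of the input (its potential pool).
pools = [random.sample(range(N_INPUT), 128) for _ in range(N_COLUMNS)]

def spatial_pool(input_bits):
    """input_bits: set of active input indices → set of winning columns."""
    overlaps = [sum(1 for i in pool if i in input_bits) for pool in pools]
    winners = sorted(range(N_COLUMNS), key=lambda c: overlaps[c],
                     reverse=True)
    return set(winners[:ACTIVE_COLUMNS])

active_input = set(random.sample(range(N_INPUT), 300))
columns = spatial_pool(active_input)
print(len(columns))  # → 40, regardless of input size or sparsity
```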
>
> That’s it for the “plumbing”.  Now comes the tricky part.
>
> We, and many others, believe that a large part of how we recognize
> things in different forms is that the brain assumes patterns that occur
> next to each other in time represent the same thing.  This is where the
> term “temporal pooler” comes from.  We want cells to respond to a
> sequence of patterns that occur over time even though the individual
> patterns don’t have common bits.  The classic case is cells in V1 that
> respond to a line moving across the retina.  These cells have learned
> to fire for a sequence of patterns (a line in different positions as it
> moves is a sequence).  The cell remains active during the sequence.
> Thus the outputs of a region are changing more slowly than the inputs
> to a region.  This basic idea is assumed to be happening throughout the
> cortex.  Temporal pooling also makes more output bits active at the
> same time.  So instead of just 40 cells active out of 64K you might
> have hundreds.
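The slow-changing-output idea in miniature, with made-up bit patterns and a single hypothetical pooled cell that stays on across the sequence:

```python
# "Outputs change more slowly than inputs": a pooled cell stays active
# across a learned sequence, so consecutive output patterns overlap
# even when consecutive input patterns share no bits.

inputs = [{1, 2}, {3, 4}, {5, 6}]  # line at three positions: no shared bits
pooled_cell = 99                   # cell trained to pool this sequence

outputs = [frozenset(p | {pooled_cell}) for p in inputs]

input_overlap = inputs[0] & inputs[1]
output_overlap = outputs[0] & outputs[1]
print(len(input_overlap), len(output_overlap))  # → 0 1
```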
>
> The CLA was designed to solve the temporal pooling problem.  When we
> were working on vision problems, the temporal pooler was the key thing
> we were testing.  We have disabled this feature when using the CLA in a
> single region because it makes the system slower.  The temporal pooler
> without the “pooling” is still needed for sequence learning.
>
> There is a biological problem with pooling the way we implemented it
> that I never resolved.  So it is a work in progress.
>
> Conclusion: to connect two CLAs together in a hierarchy, all the cells
> in the lower region become the input to the next region.  But there are
> some difficult issues you might need to understand to get good results,
> depending on the problem.
>
> Jeff
>
> From: nupic [mailto:[email protected]] On Behalf Of Tim
> Boudreau
> Sent: Wednesday, August 28, 2013 4:29 PM
> To: NuPIC
> Subject: [nupic-dev] Inter-layer plumbing
>
> Is there a general notion of how layers should be wired together, so
> that one layer becomes input to the next layer?
>
> It seems like input into one layer is pretty straightforward - in ASCII
> art:
>
> bit bit bit bit bit bit bit bit
>  |       |   |       |       |
>  ------proximal dendrite w/ boost factor---> column
>
> But it's less clear:
>
>  - If we have the hierarchy input -> layer 1 -> layer 2, what
> constitutes an input bit to layer 2 - the activation of some
> combination of columns from layer 1?
>
>  - How information about activation in layer 2 should reinforce
> connections in layer 1.
>
> Any thoughts?
>
> -Tim
>
> --
> http://timboudreau.com
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
