Mike,

Ok, I am glad it is valuable. Matt also pulled me aside and said to stop worrying about it. When I give talks the neuroscience is often confusing to people, so it has been suggested I avoid going into too much neuroscience detail. The neuroscience is hard to follow if you don't know the terminology or have experience with the concepts. From now on I won't shy away from it.

Jeff

-----Original Message-----
From: nupic [mailto:[email protected]] On Behalf Of Ralph Dratman
Sent: Thursday, August 29, 2013 7:20 PM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] Inter-layer plumbing

Jeff,

Thanks much for this exposition. It is valuable.

Is this material "beyond my interest level"? Absolutely not. I am very much interested.

Is it beyond my current ability to comprehend? Ok -- admitted. I want to learn more.

Ralph Dratman

On Thu, Aug 29, 2013 at 6:53 PM, Jeff Hawkins <[email protected]> wrote:
> “Could you expand a little on what biological problem you're referring
> to here?
>
> -Mike”
>
> Ok, but I suspect it is beyond most people’s interest level, and I
> don’t want to confuse anyone. But for those that are interested….
>
> The neurons in the CLA can be in a “predictive state”. Biologically
> this is a cell that is depolarized.
>
> The neurons in the CLA can be in an “active state”. Biologically this
> is equivalent to firing or generating one or more spikes.
>
> These two states are sufficient for learning sequences, but not for
> temporal pooling.
>
> The addition of temporal pooling requires a third state which I don’t
> like because it is a little tricky to make it work with real neurons.
>
> When we first implemented the CLA we started with sequence memory and
> everything worked fine. After a bunch of testing we added temporal
> pooling. With temporal pooling the cells learn to predict their feed
> forward activation earlier and earlier. It works like this. First a
> cell becomes active due to a feed forward input. It then forms synapses
> that allow it to predict its activity one step in advance. Later it
> becomes active one step in advance and then forms synapses that allow
> it to predict its activity two steps in advance, and so on. (The
> system doesn’t require discrete steps but it is easier to think about
> it that way.)
> Over repeated training, a cell learns to be active over longer and
> longer sequences of patterns. This is cool for a number of reasons.
> A cell will learn to be active for as much time as it can correctly
> predict its future activity. If the world consists of a few long
> repeatable sequences then cells will be active over long periods of
> time. The data determines how much pooling a cell can do. The more
> pooling that can be done at one level of the hierarchy the easier the
> job of the next level. It also suggests why we can learn new tasks
> very quickly (i.e. learn a new sequence) but to master something, to
> make something second nature, requires many repetitions. I mentioned
> this in On Intelligence when I said with practice knowledge gets
> represented lower and lower in the hierarchy. As a region gets better
> at temporal pooling it frees the memory in the next region for more
> advanced inference.
>
> The problem is that cells that are pooling over time must be
> active/spiking, not just depolarized as in sequence learning. When
> cells become active by pooling in advance of feed forward activation,
> it messes up the sequence memory. The CLA can’t tell the difference
> between activation because of a real world feed forward input and
> activation because of pooling. What happens is the CLA doesn’t wait
> for real input and sequences run away forward in time.
>
> For pooling to work the CLA needs to distinguish between cell
> activation due to feed forward input and cell activation due to
> pooling. We need two different states for an active cell.
>
> There is an elegant biological solution to this but the evidence is
> equivocal. The solution is: when a cell is activated due to
> feedforward input it generates a short burst of action potentials,
> three to five. It does this once and then stops. When a cell is
> activated by pooling it generates a series of spaced out spikes.
> Believe it or not there are quite a few papers that suggest this could
> be happening. There is evidence of short bursts prior to a steady
> firing pattern. The mini-bursts are in the literature, easy to find.
> I spoke to several scientists and they report seeing them. Some claim
> they see them at the beginning of every trace. However, others say
> they never see the mini-bursts. The best evidence for mini-bursts is
> in layer 5 cells (yes the motor ones that also project up the
> hierarchy). These cells are called “intrinsically bursting” cells to
> reflect this behavior. For temporal pooling to work I think we also
> need to see this mini-bursting behavior in layer 3. Mini-bursts are
> seen in layer 3 but not by everybody. The evidence is much spottier.
> It is possible that all layer 3 cells exhibit this behavior and
> scientists are not reporting them. Perhaps there are different classes
> of layer 3 cells and only some mini-burst. I wish the evidence was
> more conclusive.
>
> For the mini-bursting hypothesis to be correct a cell has to behave
> differently when receiving a mini-burst than when receiving regular
> spaced spikes. Here too the evidence is good.
>
> The synapses that form on distal dendrite branches (sequence and
> pooling memory synapses) are far more effective when they get a burst
> of quick spikes in a row. A thin dendrite amplifies the effect of
> multiple spikes because thin dendrites don’t leak current quickly and
> they have low capacitance. Thus a burst of spikes on multiple
> synapses may be necessary for our dendrite segment coincidence
> detector to work. A single spike won’t do it. If a cell produces
> single spikes (not mini-bursts) when activated by a distal dendrite
> branch then sequences won’t run away. This is what we need, it solves
> our problem!
>
> Conversely, axons that project up the hierarchy form synapses on
> proximal dendrites (the SP synapses).
> Here, because the synapses are close to the big cell body and the
> dendrites have large diameters there is large current leakage and low
> capacitance. It has been shown that the first arriving spike on a
> proximal synapse has a large effect (depolarization) but subsequent
> spikes in a mini-burst have a much diminished effect. This is good
> because we don’t want the spatial pooler in the higher region to be
> overly influenced by the mini-bursts. We want the SP to look at all
> active axons equally, those that are mini-bursting and those that are
> single spiking via pooling. This is another nice validation of the
> theory.
>
> If you have followed all of this you see that the mini-burst
> hypothesis solves the issues of pooling in a hierarchy and it is
> supported by a lot of biological evidence. It is a pretty cool
> explanation for why we see mini-bursts in layer 5 cells. My only
> worry is that the evidence for mini-bursting in layer 3 cells is
> spotty. If everyone said all layer 3 cells are intrinsically bursting
> like forward projecting layer 5 cells I would be much happier. All in
> all the theory holds together remarkably well and I don’t have another
> one, so I am sticking with it for now.
>
> Of course none of this matters for the SW implementation, but I have
> found over and over again that if you stray from the biology you will
> get lost.
>
> Jeff
>
> From: nupic [mailto:[email protected]] On Behalf Of Michael Ferrier
> Sent: Thursday, August 29, 2013 11:40 AM
> To: NuPIC general mailing list.
> Subject: Re: [nupic-dev] Inter-layer plumbing
>
>>> There is a biological problem with pooling the way we implemented
>>> that I never resolved. So it is a work in progress.
>
> Hi Jeff,
>
> Could you expand a little on what biological problem you're referring
> to here?
>
> Thanks!
>
> -Mike
>
> _____________
> Michael Ferrier
> Department of Cognitive, Linguistic and Psychological Sciences, Brown
> University [email protected]
>
> On Thu, Aug 29, 2013 at 2:29 PM, Jeff Hawkins <[email protected]> wrote:
>
> Here are some thoughts about how to connect CLAs in a hierarchy.
>
> Here are some things we know about the brain.
>
> - Layer 3 in the cortex is the primary input layer. (Sometimes input
> goes to layer 4 and layer 3, but layer 4 projects mostly to layer 3
> and layer 4 doesn’t always exist. So layer 3 is the primary input
> layer. It exists everywhere. We will ignore layer 4 for now.)
>
> - I believe the CLA represents a good model of what is happening in
> layer 3.
>
> - The output (i.e. axons) of layer 3 cells project up the hierarchy
> connecting to the proximal dendrites (SP) of the next region’s layer 3.
>
> - This isn’t the complete picture. The axons of cells in layer 5
> (the ones that project to motor areas) split in two and one branch also
> projects up the hierarchy to layer 3 in the next region. If we aren’t
> trying to incorporate motor behavior then we can ignore layer 5 and
> say input goes from layer 3 to layer 3 to layer 3, etc. Or CLA to CLA
> to CLA, etc.
>
> Each cell in layer 3 projects to the next region, so the input to a
> region is the output of all the cells in the previous region’s layer
> 3. If we consider our default CLA size there would be 64K input bits
> to the next level in the hierarchy. Because of the distributed nature
> of knowledge it isn’t necessary that all cells in layer 3 project to
> the next region, as long as a good portion do we should be ok. But
> assume they all do.
>
> 64K is a lot of input bits but the SP in the receiving region can take
> any number of bits and map them onto any number of columns. That is
> one of the nice features of the SP, it can map an input of any
> dimension and sparsity to any number of columns.
>
> That’s it for the “plumbing”. Now comes the tricky part.
>
> We, and many others, believe that a large part of how we recognize
> things in different forms is that the brain assumes that patterns that
> occur next to each other in time represent the same thing. This is
> where the term “temporal pooler” comes from. We want cells to respond
> to a sequence of patterns that occur over time even though the
> individual patterns don’t have common bits. The classic case is cells
> in V1 that respond to a line moving across the retina. These cells
> have learned to fire for a sequence of patterns (a line in different
> positions as it moves is a sequence). The cell remains active during
> the sequence. Thus the outputs of a region are changing more slowly
> than the inputs to a region. This basic idea is assumed to be
> happening throughout the cortex. Temporal pooling also makes more
> output bits active at the same time. So instead of just 40 cells
> active out of 64K you might have hundreds.
>
> The CLA was designed to solve the temporal pooling problem. When we
> were working on vision problems the temporal pooler was the key thing
> we were testing. We have disabled this feature when using the CLA in
> a single region because it makes the system slower. The temporal
> pooler without the “pooling” is still needed for sequence learning.
>
> There is a biological problem with pooling the way we implemented that
> I never resolved. So it is a work in progress.
>
> Conclusion: to connect two CLAs together in a hierarchy, all the
> cells in the lower region become the input to the next region. But
> there are some difficult issues you might need to understand to get
> good results depending on the problem.
>
> Jeff
>
> From: nupic [mailto:[email protected]] On Behalf Of Tim Boudreau
> Sent: Wednesday, August 28, 2013 4:29 PM
> To: NuPIC
> Subject: [nupic-dev] Inter-layer plumbing
>
> Is there a general notion of how layers should be wired together, so
> that one layer becomes input to the next layer?
>
> It seems like input into one layer is pretty straightforward - in
> ascii art:
>
> bit bit bit bit bit bit bit bit
> | | | | |
> ------proximal dendrite w/ boost factor---> column
>
> But it's less clear
>
> - If we have the hierarchy input -> layer 1 -> layer 2, what
> constitutes an input bit to layer 2 - the activation of some
> combination of columns from layer 1?
>
> - How information about activation in level 2 should reinforce
> connections in layer 1
>
> Any thoughts?
>
> -Tim
>
> --
> http://timboudreau.com
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
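
A minimal Python sketch of the mini-burst idea discussed in this thread. It is a toy, not NuPIC code: the CellState enum, the function names, and the one-synapse chain of cells are all assumptions made for illustration. It tries to show two points from Jeff's description. First, every firing cell counts equally as an input bit for the next region's SP, whether it is mini-bursting or single spiking. Second, sequences only run away when pooled activation is allowed to trigger distal segments the same way feed forward activation does.

```python
from enum import Enum


class CellState(Enum):
    INACTIVE = 0
    PREDICTIVE = 1     # depolarized but not firing (listed for completeness)
    ACTIVE_BURST = 2   # mini-burst: activated by feed forward input
    ACTIVE_POOLED = 3  # spaced single spikes: activated by pooling


FIRING = (CellState.ACTIVE_BURST, CellState.ACTIVE_POOLED)


def output_bits(states):
    """Input bits for the next region's SP: every firing cell counts equally."""
    return [1 if s in FIRING else 0 for s in states]


def drives_distal(state, distinguish):
    """Can a presynaptic cell in this state trigger a distal segment?

    With distinguish=True only a mini-burst is strong enough (the hypothesis).
    With distinguish=False any firing cell is (a single "active" state).
    """
    if distinguish:
        return state is CellState.ACTIVE_BURST
    return state in FIRING


def furthest_cell_reached(n_cells, n_steps, distinguish):
    """Toy chain: cell i has one distal segment listening to cell i - 1.

    Cell 0 receives one real feed forward input at t = 0 and no input
    arrives afterwards. Every cell is assumed to have already learned to
    pool, so a cell whose distal segment is triggered fires ahead of its
    own feed forward input. Returns the highest index of any cell that fired.
    """
    states = [CellState.INACTIVE] * n_cells
    states[0] = CellState.ACTIVE_BURST
    furthest = 0
    for _ in range(n_steps):
        nxt = [CellState.INACTIVE] * n_cells
        for i in range(1, n_cells):
            if drives_distal(states[i - 1], distinguish):
                nxt[i] = CellState.ACTIVE_POOLED
                furthest = max(furthest, i)
        states = nxt
    return furthest


if __name__ == "__main__":
    print(furthest_cell_reached(6, 5, distinguish=False))  # one active state
    print(furthest_cell_reached(6, 5, distinguish=True))   # burst vs. pooled
```

With six cells and no further input, the single-active-state version reaches the last cell (index 5), while the version that separates burst activation from pooled activation stops at index 1. That is the runaway behavior and its proposed fix, reduced to the smallest case I could think of.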
