The problem is that cells pooling over time must be active (spiking), not just 
depolarized as in sequence learning.  When cells become active by pooling in 
advance of feed-forward activation, it disrupts the sequence memory.  The CLA 
can’t tell the difference between activation caused by real-world feed-forward 
input and activation caused by pooling.  What happens is the CLA doesn’t wait 
for real input, and sequences run away forward in time.
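A toy sketch of that failure mode (the transitions table and function names are 
made up for illustration, not from NuPIC): if predicted cells are treated as 
fully active, the layer advances through the learned sequence without waiting 
for feed-forward input.

```python
# Hypothetical toy: a learned sequence A -> B -> C -> D -> E.
transitions = {"A": "B", "B": "C", "C": "D", "D": "E"}

def step(state, feedforward, predictions_count_as_active):
    """Advance the layer's state one timestep."""
    predicted = transitions.get(state)
    if predictions_count_as_active and predicted:
        return predicted          # advances with no input at all
    return feedforward            # waits for the real next input

# The feed-forward input stalls at "B", but the pooled version runs ahead.
state = "A"
trace = []
for ff in ["B", "B", "B"]:
    state = step(state, ff, predictions_count_as_active=True)
    trace.append(state)
print(trace)   # the state ran ahead to D while the input never moved past B
```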

 

Are we talking about within one layer, or across layers here?

 

>> Within one layer.  The CLA models just one layer of cells.

 

I don't know that I can step through enough of what happens in my head yet to 
see how that happens in a single layer.  But what does look like a hole is 
this:  say cell A is active due to input, and cell B is predictively activated, 
not because the immediate next input will activate it, but because the input 
*after that* will.  There is no way to differentiate a prediction for further 
in the future than the next step from a wrong prediction about the next step.  
Is that the issue, or am I just off in outer space?

 

>> Sorry, I couldn’t follow your question. 

It might be useful to step back and remind ourselves what problems we are 
trying to solve.  Within each region of the cortex we want to do the following.

 

1) Inference of sequences.

We know the cortex recognizes sequences of patterns.  All audition (e.g. spoken 
language, music) is only recognized when played in the correct order.  The same 
is true for vision and touch: they are temporal inference problems.  (We can 
recognize some still images, but vision is mostly temporal.)

 

2) Prediction while inferring.

We know that the cortex is constantly predicting what is going to happen next.  
We know this because we recognize when things change.  It appears we make 
multiple predictions simultaneously.  The properties of SDRs solve the multiple 
prediction problem beautifully.  The cortex needs to detect anomalies so it can 
direct attention to unpredicted input.
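A small sketch of the multiple-prediction property (the cell count, sparsity, 
and match threshold are arbitrary stand-ins, not NuPIC parameters): the union 
of several sparse predictions still reliably recognizes each predicted input, 
while a random novel input barely overlaps the union, which is what lets 
anomalies stand out.

```python
import random

random.seed(0)
N = 2048   # cells in the layer
W = 40     # active cells per SDR (~2% sparsity)

def random_sdr():
    """A random sparse pattern of W active cells out of N."""
    return frozenset(random.sample(range(N), W))

# Three possible next inputs, each with its own SDR.
candidates = {name: random_sdr() for name in ("B1", "B2", "B3")}

# Predicting all three at once is just the union of their SDRs.
predicted = set().union(*candidates.values())

def matches(sdr, predicted, threshold=0.9):
    """An input counts as 'expected' if most of its cells were predicted."""
    return len(sdr & predicted) >= threshold * len(sdr)

# Each stored candidate is recognized within the union...
assert all(matches(sdr, predicted) for sdr in candidates.values())

# ...while a random novel input overlaps by only a few cells (anomaly).
novel = random_sdr()
print(matches(novel, predicted))
```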

 

3) Temporal pooling.

This is when cells learn to fire continuously during part or all of a sequence. 
 We know there are cells that do this for multiple phenomena in vision and 
audition.  Temporal pooling is not as widely understood, but many people have 
reached the same hypothesis: that temporal pooling plays a large part in how we 
build invariant representations.  We don’t know if temporal pooling is 
occurring everywhere in the cortex, but I work on the assumption that it is.  
Temporal pooling also makes the hierarchy work much better.  It means each 
level in the hierarchy can learn as much spatial and temporal context as it 
can, freeing up the next level to work on more complex patterns.  If we didn’t 
have temporal pooling, the output of layer 3 would change at the same rate as 
the input; we need to get to slower, more stable concepts as we ascend the 
hierarchy.

 

The CLA is a great model of how a layer of cells does 1 and 2. It only requires 
that the active cells are spiking and the predictive cells are depolarized. I 
am confident the CLA is close to how real cells do inference and prediction.

 

Things get tricky when we try to add temporal pooling, to get a layer of cells 
to do 1, 2, and 3.  My assumption is that a single layer, layer 3 at a minimum, 
has to do inference, prediction, and temporal pooling.  This could be an 
incorrect assumption.  Some scientists divide layer 3 into 3a and 3b.  Some 
scientists designate a separate layer 2 and some don’t.  Perhaps these 
sublayers are doing different things, some doing inference and others pooling.  
So when discussing temporal pooling we need to keep in mind that the attempt to 
do it all in one layer of cells might be wrong.  It is speculative.  I think it 
would be simpler and more elegant if we can show that a single layer of cells 
can infer, predict, and do temporal pooling.  That is what I am trying to do, 
but it might be wrong.

 

So how does the “proposal” for temporal pooling work?  Say we have a sequence 
A-B-C-D-E that repeats.  When the E cells first become active they form 
synapses to the cells that were just active during D, so now the E cells will 
become predictively active when they see the D cells again.  However, when the 
E cells become predictively active due to the D cells, pattern C has just been 
active, so the E cells form synapses (on a different segment) with the C cells. 
 After this, the E cells become predictively active when they see either the C 
or D cells.  The process can repeat.  At the end, a particular E cell has one 
dendrite segment that recognizes D, another dendrite segment that recognizes C, 
another segment that recognizes B, etc.  So now the E cells will be 
predictively active when they see A, B, C, or D.  We would see the E cells fire 
continuously during the A-B-C-D-E sequence, but only at E is the cell 
responding to its feed-forward receptive field.
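The process above can be sketched as a toy simulation.  The PoolingCell class 
and its one-pattern "segments" are illustrative inventions, far simpler than 
real dendrite segments, but they show the cell reaching one step further back 
on each pass through the sequence until it fires for the whole of it.

```python
# One "E cell" learns to pool backward over the repeating sequence A-B-C-D-E.
sequence = ["A", "B", "C", "D", "E"]

class PoolingCell:
    def __init__(self, feedforward):
        self.feedforward = feedforward   # pattern that drives it directly
        self.segments = set()            # each segment recognizes one pattern

    def step(self, pattern, prev_pattern):
        active = pattern == self.feedforward    # real feed-forward drive
        predictive = pattern in self.segments   # lateral recognition
        if (active or predictive) and prev_pattern is not None:
            # Form a segment on whatever pattern preceded this state,
            # reaching one step further back each pass through the sequence.
            self.segments.add(prev_pattern)
        return active or predictive

e_cell = PoolingCell("E")
for repeat in range(5):
    fired = []
    prev = None
    for pattern in sequence:
        if e_cell.step(pattern, prev):
            fired.append(pattern)
        prev = pattern
    print(repeat, fired, sorted(e_cell.segments))
```

By the final pass the cell has segments for A, B, C, and D, so it fires at 
every element of the sequence, even though only E drives it feed-forward.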

 

As I said earlier, this all looks good except that it requires us to change the 
“predictive state” from being depolarized to steady firing, and the active 
state from being steady firing to mini-bursting.  That is a relatively fine 
distinction that makes me uncomfortable.  There are other weirdnesses that I 
haven’t resolved.  For example, this would say the mini-burst occurs at the end 
of a sequence, whereas there is more evidence that mini-bursts occur at the 
beginning.

 

A TOTALLY DIFFERENT APPROACH

 

One time I was talking to Murray Sherman and he suggested a completely 
different approach to temporal pooling.  It is much, much simpler but dumber 
(the approach, not Murray; he is smart).

 

Excitatory synapses can be divided into ionotropic and metabotropic.  The 
former involve only the flow of ions across the cell membrane; they are quick 
to start and quick to stop.  The latter invoke a metabolic pathway (chemistry, 
proteins, and so on); they are slower.  A metabotropic synapse will depolarize 
a cell for up to a third or half of a second from one incoming spike, whereas 
the effect of a spike at an ionotropic synapse lasts just a few milliseconds.  
The synapses near the cell body (the inputs to a region, the SP synapses) are 
the slow type.  The synapses on the distal dendrites (our TP synapses) are the 
fast type.  What this means is that a fast-changing input to layer 3 will 
result in a slower-changing response.  This will definitely lead to slower and 
slower responses as you ascend the hierarchy.
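A toy illustration of that idea (the decay constants and the 0.1 threshold are 
arbitrary stand-ins, not measured values): each input pulse leaves a trace that 
decays over time, and with a slow decay consecutive responses overlap, so the 
layer's output changes more slowly than its input.

```python
def responses(inputs, decay):
    """Each input pulse leaves a trace that decays by `decay` per step;
    a symbol stays in the active set while its trace exceeds 0.1."""
    traces = {}
    active_sets = []
    for x in inputs:
        traces = {k: v * decay for k, v in traces.items() if v * decay > 0.1}
        traces[x] = 1.0
        active_sets.append(frozenset(traces))
    return active_sets

inputs = list("ABCDE")
fast = responses(inputs, decay=0.05)   # few-millisecond ionotropic effect
slow = responses(inputs, decay=0.8)    # fraction-of-a-second metabotropic effect

def overlaps(sets):
    """Total overlap between consecutive active sets."""
    return sum(len(a & b) for a, b in zip(sets, sets[1:]))

# Fast synapses give disjoint responses step to step; slow synapses give
# heavily overlapping ones, i.e. a slower-changing output.
print(overlaps(fast), overlaps(slow))
```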

 

This approach is dumb because it involves no learning.  It pools anything and 
everything that occurs even once in sequence.  It is also hard to see how we 
can get cells that stay active for several seconds during sequences.

 

CONCLUSION

Because the temporal pooling question remains unresolved I prefer to ignore it 
for now.  We chose to work on problems of prediction and anomaly detection in a 
single region that don’t require temporal pooling.  But I don’t want to 
discourage others from working on it.


That actually brings up a question I was wondering about:  When we talk about 
making predictions several steps in advance, how is that actually done - by 
snapshotting the state, then hypothetically feeding the system's predictions 
into itself to get further predictions and then restoring the state;  or is it 
simply that you end up with cells which will only be predictively active if 
several future iterations of input are likely to *all* activate them?

 

>> When NuPIC makes predictions multiple steps in advance, we use a separate 
>> classifier, external to the CLA. We store the state of the CLA in a buffer 
>> for N steps and then classify those states when the correct data point 
>> arrives N steps later.
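A minimal sketch of that buffering scheme (this is not NuPIC's actual 
classifier; the "state" here is just a hashable token and the counting table 
is a crude stand-in for the real classifier): the state from N steps ago is 
associated with the input arriving now, and predictions are read back out of 
those learned associations.

```python
from collections import defaultdict, deque

N = 2                                    # how many steps ahead to predict
buffer = deque(maxlen=N)                 # the last N layer states
counts = defaultdict(lambda: defaultdict(int))

def learn_and_predict(state, actual_input):
    """Associate the state from N steps ago with the input arriving now,
    then predict what should arrive N steps after the current state."""
    if len(buffer) == N:
        counts[buffer[0]][actual_input] += 1   # state N steps ago -> now
    buffer.append(state)
    votes = counts.get(state)
    # Most frequently seen input N steps after this state, if any.
    return max(votes, key=votes.get) if votes else None

# Toy stream: the value cycles 0..4, so state s is always followed,
# two steps later, by (s + 2) % 5.
prediction = None
for t in range(50):
    s = t % 5
    prediction = learn_and_predict(s, s)

print(prediction)   # prediction for the last state, two steps ahead
```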

Jeff

_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
