[nupic-discuss] Principles of neural attention: The mirroring mechanism

Bert Frederiks Mon, 17 Feb 2014 15:13:18 -0800

To introduce myself... In a "previous life" I wrote a book (not really
finished but "dumped" on the Internet in 2001): "The Time Machine,
Prototype of a Conscious Machine". Like Jeff Hawkins I read the
university library while actually studying something else, in the late
1980s, never graduating.

Last week my attention was drawn to this great project, Nupic, when I was
sent your white paper. I could not agree more with your approach and the
thoughts of Jeff Hawkins. Thank you very much, especially for your
positive attitude, Jeff.

In this mail I want to present my idea of how you could implement
hierarchy and attention in such a way that the neural machinery becomes
a machine that can imagine things, especially temporal things, that is,
temporal images. I think one should call these temporal images
consciousness, though not self-consciousness. Note that this system
will also not be able to speak yet, no matter how big and fast you make it.

The text is edited cut-and-paste from earlier work, trying to fit it to
Nupic. Where one reads "us" or "humans" one should read "Nupic's HTM".
There is much redundancy in the text. I could make a summary for
programmers or the like. Just ask. It would be very nice if this gets
implemented.

(Please do not try to read my book; it is too badly written; better ask
me to write on something. If you cannot resist, skip chapters 5, 6, and 7.)


I attached two figures to help explain this text:

1 (img58.gif): A hierarchical neural network split in two halves

2 (img65.gif): Intra- and inter-cortical projections in our brains. II,
III, and IV refer to layers II, III, and IV of our cerebrum. Straight
lines are intra-cortical projections. Bent lines are inter-cortical
projections.


Take two pattern associators. Imagine them to have an input side at one
end, and an output side at the other end. Lay them on top of each other,
but in opposite directions with regard to each other's input and output
side. Then reconnect them such that each one of them tends to mirror,
mimic, or ‘imitate,’ the other, but in the opposite direction with regard
to each other (see attached figure).

To the input of the first neural network are attached sensory devices
like eyes and ears. This neural network I call the deconstructing neural
network. The input of the second network is derived from the attention
mechanism. This neural network I call the constructing neural network.

At the split in this network a process which I call mirroring takes
place. It is essential that the neural activation in each half of the
split neural network is flowing in the opposite direction.

In the split neural network, taken as a whole, I take the sensory end to
be the bottom, and the other end, that is the side of the attention
mechanism, to be the top of the neural network.

Neural activation flowing from the senses to the attention mechanism can
also be referred to as forward projection, while downward activation from
the attention mechanism to the senses is often referred to as backward
projection. This terminology is common in medical, neuro-physiological
literature.

Imagine that the attention mechanism does no more than to find a small
number of most active neurons in the top of the deconstructing neural
network, and then it activates some neurons at the same place, but in the
constructing network, while depressing all other neurons at this highest
level. As a result there are, from moment to moment, only a few neurons
significantly more active at the highest neural level of both the
constructing neural network and the deconstructing neural network, with
still many neurons being activated in the lower neural levels.
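For programmers, the simple attention mechanism described above could be sketched roughly as follows. This is only an illustration under assumptions of mine: activations are plain lists of numbers, and the names attend, k, boost and floor are hypothetical, not anything from Nupic.

```python
def attend(top_activity, k=3, boost=1.0, floor=0.0):
    """Find the k most active top-level units of the deconstructing network
    and build the top-level activity of the constructing network from them.

    Assumption: 'activating' a unit means adding `boost` to its activity,
    and 'depressing' all other units means setting them to `floor`.
    """
    # Indices ordered by activity, strongest first (ties keep index order).
    ranked = sorted(range(len(top_activity)), key=lambda i: -top_activity[i])
    winners = sorted(ranked[:k])
    constructing_top = [floor] * len(top_activity)
    for i in winners:
        constructing_top[i] = top_activity[i] + boost
    return winners, constructing_top


# Top level of the deconstructing network at one moment (made-up numbers).
winners, constructing_top = attend([0.125, 0.875, 0.25, 0.75, 0.5], k=2)
print(winners)           # [1, 3]
print(constructing_top)  # [0.0, 1.875, 0.0, 1.75, 0.0]
```

Only the highest level is treated here; the lower levels keep whatever activity they already have.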

Since the deconstructing neural network is mirroring the constructing
network, it will tend to look the same and vice versa. But the neural
activation in each network is flowing in the opposite direction. This is
the principle. One of the consequences is that we can attend to something
through the constructing neural network, while something can attract our
attention through the deconstructing neural network.


I cannot say for sure what the best mechanism for mirroring between
neural networks is. I can think of a number of possibilities.

My guess is that in our own brains the networks selectively and
temporarily sensitize or desensitize each other, thereby setting out a
preferred route for activation within each network. By sensitizing I
mean, in this context, temporarily facilitating the activation of a
certain neuron (in Nupic: creating a predictive state). This may, of
course, indirectly lead to a more permanent sensitivity of the neuron
with regard to other neurons.

One difference with ordinary neural activation is that mirroring
connections have no memory. Their inter-neural strength, or weight, or
sensitivity, is fixed, except, I guess, in a global sense, for example
with regard to emotional arousal. Above all their weight is relatively
small, so that it won’t influence our perception more than necessary.

Although the sensitivity between the two neurons in the two halves of the
split neural network is itself fixed, their sensitivity with regard to
neurons within their own half of the neural network, of course, is not.
So, if a neuron in one network is temporarily sensitized by the other
network, then this neuron will become an easier route for neural
activation within the first network. As a result chances are that
relatively more activation will flow through this neuron. Thereby its
sensitivity toward neurons leading to this increased activation will also
increase. The net result of both perception and learning is mirroring.
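A toy sketch of this sensitizing idea, for one neuron in one half of the split network. Everything specific here is my assumption for illustration: the multiplicative form of the facilitation, the plain Hebbian update, and the names MIRROR_WEIGHT, LEARNING_RATE and step.

```python
MIRROR_WEIGHT = 0.1   # fixed and small: the mirroring connection never learns
LEARNING_RATE = 0.5   # assumed rate for the weights within one half


def step(inputs, weights, mirror_activity):
    """Activate one neuron and update its learned weights in place.

    inputs          -- activities of neighbours within the same half
    weights         -- learned weights within the same half
    mirror_activity -- activity of the partner neuron in the other half
    """
    drive = sum(w * x for w, x in zip(weights, inputs))
    # Sensitizing: the partner only facilitates existing drive; it cannot
    # activate the neuron on its own.
    activity = drive * (1.0 + MIRROR_WEIGHT * mirror_activity)
    # Simple Hebbian update (no normalisation, so this is a sketch only).
    for i, x in enumerate(inputs):
        weights[i] += LEARNING_RATE * x * activity
    return activity


plain = [0.5, 0.5]
sensitized = [0.5, 0.5]
step([1.0, 0.0], plain, mirror_activity=0.0)
step([1.0, 0.0], sensitized, mirror_activity=1.0)
print(sensitized[0] > plain[0])  # True
```

The sensitized neuron ends up with a stronger learned weight on the active input, although the mirroring weight itself stayed fixed.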

One can view mirroring as generating error-signals, the error signal
being the difference between the actual and the desired mirroring state.
This is a biologically possible implementation of error back-propagation.

One could also think of a direct influence of the error signal, having it
really activate or deactivate the other neural network a little, instead
of only facilitating activation. To me this seems a bit
counterintuitive, but the effect will probably not be very different,
considering that there is always a lot of neural noise in our brains.
Besides, this has (probably) been shown to work a long time ago by
Stephen Grossberg (Adaptive Resonance Theory).
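The error-signal view, in its direct variant, might look like this in code. It is a sketch under my own assumptions: each level's activity is a list of numbers, the desired state of one half is simply the current state of the other, and the names mirroring_error, nudge, mismatch and rate are hypothetical.

```python
def mirroring_error(own, partner):
    """Difference between the desired state (the partner's) and the actual one."""
    return [p - o for o, p in zip(own, partner)]


def nudge(own, partner, rate=0.25):
    """Activate or deactivate each unit a little, in the direction of its partner."""
    error = mirroring_error(own, partner)
    return [o + rate * e for o, e in zip(own, error)]


def mismatch(a, b):
    """Total absolute difference between the two halves at one level."""
    return sum(abs(x - y) for x, y in zip(a, b))


decon = [1.0, 0.0]   # one level of the deconstructing network
con = [0.0, 1.0]     # the same level of the constructing network
new_decon = nudge(decon, con)
new_con = nudge(con, decon)
print(mismatch(decon, con))          # 2.0
print(mismatch(new_decon, new_con))  # 1.0
```

With a small rate the two halves drift toward each other without either one overruling what the other receives from its own input side.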

In our brains the amount of mirroring is possibly flexible and
controllable. The more difficulties the attention mechanism has with
attending to something, the more the deconstructing neural network should
mirror the constructing neural network in order to help the attention
mechanism to gain a hold—with all its complicating consequences regarding
the trustworthiness of our perception.
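One way to make the amount of mirroring controllable is sketched below. Both the difficulty measure and the linear gain are assumptions of mine, chosen only to make the idea concrete; the names attention_difficulty and mirroring_gain are hypothetical.

```python
def attention_difficulty(top_activity):
    """Hypothetical measure of how hard it is to attend.

    Returns a value near 0.0 when one top-level unit clearly dominates and
    1.0 when the two strongest units are tied. Assumes at least two
    non-negative activities.
    """
    first, second = sorted(top_activity, reverse=True)[:2]
    if first <= 0.0:
        return 1.0  # nothing stands out at all
    return second / first


def mirroring_gain(difficulty, min_gain=0.125, max_gain=0.5):
    """The more difficulty, the more the deconstructing half mirrors."""
    return min_gain + (max_gain - min_gain) * difficulty


difficulty = attention_difficulty([1.0, 0.5, 0.25])
print(difficulty)                  # 0.5
print(mirroring_gain(difficulty))  # 0.3125
```

The returned gain could then scale whatever mirroring influence is used between the two halves.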

The deconstructing and the constructing neural network are each other's
backward propagating networks. What is propagated backward is not an
error signal but still, the difference between the two networks may lead
to an error signal, which in turn leads to the mirroring of the two
networks.

Simplifying the working of this neural machinery, we can view the
deconstructing neural network as a pattern associator, with at one end
the sensory input from our eyes and ears, and at the other, ‘output’ end
the attention mechanism. In this view the attention mechanism provides
the teaching input for this neural network. The peculiarity here is
that this teaching input does not come from outside the thinking
apparatus, nor is there a homunculus. The attention mechanism causes
the ultimate competition between output neurons by letting only one, or a
few of them, win; in the case of human vision, about five may win.

In the other direction the constructing neural network has the attention
mechanism at its input, and the sensory devices at its output. As such
the sensory input is the teaching input of the constructing neural
network. It is more or less taught to reproduce the perceived outside
world within itself.

This attentive neural network is split in such a way that it is not only
a pattern associator, but also an auto-associator. It can imagine, and
even hallucinate things—although the latter is not what we want of
course.

Essential to this design is, amongst others, that, in the constructing
neural network, the activation pattern is remembered as if it came into
being only by the activation of the attention mechanism—which is, both
historically and actually, not true because of the mirroring mechanism.

In the deconstructing neural network, the activation pattern is
remembered as if it came into being only by activation through the
senses—which is not true either.

A consequence of the former is that the network can reproduce an image,
without the aid of the deconstructing network, by the attention mechanism
somehow ‘willingly’ attending to it.

If this happens, then we, through the deconstructing network, will tend
to attend to something which looks, sounds, feels, tastes, or smells like
something in the outside world, or, if it is not there, it will lead to
imaginations—or even hallucinations. Just think of ice cream when it’s
hot, and you know what I mean.

Conversely, a consequence is also that our senses can, through the
deconstructing neural network, make us ‘unwillingly’ attend to something,
without the aid of the constructing neural network.

The images I am speaking about in this context are mostly not
conscious-kind-of images. Personally I like to say that we can
experience these images, but then I define “to experience” as “to know
that something is absent”. If you know that something is absent, then you
know what is missing. As such it is present again, and you can use your
attention to become conscious of something.

The easier it is for the attention mechanism to attend, the stronger the
mirroring can be, and the better the thing attended to can be remembered.
In plain English: the things you know best are remembered best. This
could cause you to miss small changes. On the other hand, if there is
some kind of feedback mechanism at work that tries to keep the
difficulty of the attention mechanism attending to something constant,
then one would, on the contrary, see more peculiarities and differences
with regard to things one knows best.


Let us make my attention mechanism such, that it is able to attend to
more than one thing at a time. If we take human vision as an example, the
attention mechanism should be able to attend to five things at a time.

Five mutually exclusive attention mechanisms can together make the
neural network attend to five things at a time. The effect will be that
they find the five most active neurons and then excite their mirror
neurons in the constructing neural network more than they might already
be activated.
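As a sketch, the five attention mechanisms can be modelled as five rounds of winner-take-all, where a neuron claimed by one mechanism is excluded for the others. The list representation and the names attend_with_mechanisms, n_mechanisms and boost are my assumptions.

```python
def attend_with_mechanisms(decon_top, con_top, n_mechanisms=5, boost=1.0):
    """Let several mutually exclusive attention mechanisms each claim one neuron.

    decon_top -- top-level activity of the deconstructing network
    con_top   -- current top-level activity of the constructing network
    Returns the claimed indices (strongest first) and the new con_top.
    """
    claimed = []
    for _ in range(min(n_mechanisms, len(decon_top))):
        # Each mechanism picks the most active neuron nobody has claimed yet.
        free = [i for i in range(len(decon_top)) if i not in claimed]
        claimed.append(max(free, key=lambda i: decon_top[i]))
    new_con_top = list(con_top)
    for i in claimed:
        # Excite the mirror neuron more than it might already be activated.
        new_con_top[i] += boost
    return claimed, new_con_top


decon_top = [0.5, 0.125, 0.75, 0.25, 0.875, 0.0, 0.625]
con_top = [0.0, 0.0, 0.25, 0.0, 0.0, 0.0, 0.0]
claimed, new_con_top = attend_with_mechanisms(decon_top, con_top)
print(claimed)      # [4, 2, 6, 0, 3]
print(new_con_top)  # [1.0, 0.0, 1.25, 1.0, 1.0, 0.0, 1.0]
```

The list of claimed indices is the attention pattern; it is much smaller than the sensory pattern it belongs to.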

As such there is a pattern of attention. This attention pattern is
evidently far less complex than the sensory pattern with which it is
associated. As such, there is not only a complex, sensory pattern at the
bottom of the network—plus the distributed representation following
this—but also a far more simple attention pattern. These two are bound to
each other, though not in as simple a way as would have been the case
with singular attention.

Such an attention pattern can be remembered and retrieved consciously and
almost instantly. The attention pattern can be stored and retrieved much
faster, and more permanently than sensory activation patterns. After all,
we do not have a truly photographic memory. This remembering leads to us
subjectively associating things with each other in a classical sense.

I think the remembering of the attention pattern (in other words: the
things attended to) also leads to "chunking" this pattern into a single
unit that can become part of the next attention pattern. This chunking
probably is a natural consequence of learning in the mirroring process.
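Purely as a symbolic illustration of chunking (not a neural implementation), one could store each attention pattern as a set of at most five elements and hand back a single name for it, which can then itself be an element of a later pattern. The class AttentionMemory, its methods and the chunk names are hypothetical.

```python
MAX_ELEMENTS = 5  # the text assumes about five things can be attended at once


class AttentionMemory:
    """Stores attention patterns and chunks each one into a single unit."""

    def __init__(self):
        self._chunk_of = {}     # frozenset of elements -> chunk name
        self._elements_of = {}  # chunk name -> frozenset of elements

    def remember(self, elements):
        """Store a pattern (order does not matter) and return its chunk name."""
        key = frozenset(elements)
        if len(key) > MAX_ELEMENTS:
            raise ValueError("an attention pattern holds at most five elements")
        if key not in self._chunk_of:
            name = "chunk%d" % len(self._chunk_of)
            self._chunk_of[key] = name
            self._elements_of[name] = key
        return self._chunk_of[key]

    def recall(self, name):
        """Return the elements that were chunked under this name."""
        return set(self._elements_of[name])


memory = AttentionMemory()
face = memory.remember(["eye", "nose", "mouth"])
person = memory.remember([face, "hat"])  # a chunk used as an element
print(face, person)                      # chunk0 chunk1
print(memory.recall(person) == {"chunk0", "hat"})  # True
```

Remembering the same elements again, in any order, returns the same chunk name.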

Human beings can remember the neurologically rather simple attention
pattern in a matter of seconds. Whatever the set of visual elements, if
we can already recognize each element in it, and if the set contains
five elements or fewer, then the set as a whole can be remembered almost
instantly. That said, we do have difficulty with more than three
elements.

In non-human animals the same mechanism holds, but in them this kind of
fast learning is, I think, largely dependent on stimuli like food and
punishment. We, humans, can partly ‘act on’ our attention mechanism
ourselves through other means not described here yet.

The elements of the attention pattern may refer to prototypical instances
of things, but they may also have their own identity, or something in
between. This is like the difference in language between nouns and names.
They may also entail abstract notions like "in-or-out-ness", or whatever
a neural network can distinguish.

Disregarding narrative abilities, which we have through the use of
language, our instant, long term memory of concrete events must be
largely the consequence of this remembering of the attention pattern. It
is not easy to remember something which we know little about. We must
first train the deconstructing and the constructing neural network, in
order to be able to remember something easily, or even to perceive
something consciously.


Subjectively the attention mechanism is also an association mechanism.

Since the attention mechanism can attend to five things at a time, we may
conceive of attentional hierarchies as consisting of five intermingled
hierarchies. This has very interesting consequences. It allows for new,
more or less sensible discoveries to happen within our brains.
This, for instance, is a prerequisite for language: The intermingling of
attentional activation hierarchies allows for the possibility of a
mixture of words (read: a (part of a) sentence) to gain a new meaning
which goes beyond each of the words. Think of metaphors.

Where attentional stimulation on the perceptive side of a neural network
leads to imagination or hallucination, on the motor side this usually
immediately leads to bodily activity.



==Physiology

I attached a drawing of how all this is to be placed in the human
cerebrum. If we divide the cortex into the six well-known cortical layers
and draw the cortical projections between some of them, then we roughly
arrive at something like the attached figure. Only layers II, III, and IV
are named here, and of course the hierarchy involves more steps, and
contains many more branches. I think that each of the 100 or so cortical
areas is a distinct piece in a neural level, having its place somewhere
in the hierarchical tree of the larger neural network. In this picture
straight lines are projections within a neural level. Bent lines are
projections between neural levels. Deconstructing neural activation flows
from left to right. Constructing activation flows from right to left. We
see that both of these activation streams connect to the other at a
certain interval. We see that the lessons, which the deconstructing and
the constructing neural network teach each other, must take place
somewhere in or between cortical layer II and cortical layer III.

For a complex of reasons I think that attention initially stems from the
hippocampus and associated structures, and therefore I guess that the
difference between reptiles and mammals is largely that the latter have
lower-level neural networks that are capable of learning, namely the
cerebral cortex, and the former do not. This physiology makes it likely
that emotional arousal and instincts can interfere rather directly with
the attention mechanism and gain lots of control over our seemingly
voluntary actions. Except for sexuality and aggression this is especially
noticeable in non-humans. It is in accordance with some famous cases of
human brain-damage too—I think of “H.M.”—if you take into account that,
if you lose lots of the attention-mechanism at a later age, lower-level
neural groups will by then have been trained to take over much of its
function. So, in effect you ‘only’ lose your temporal, ‘declarative,’ or
‘epic’ memory with regard to new things—that is anterograde amnesia.

Next to the attention mechanism mentioned above there are many other
mechanisms which deserve the name ‘attention mechanism’ equally well.
Take for instance the many mechanisms that make us physically direct our
head and eyes toward something. I furthermore think that there are some
inborn mechanisms which make us mentally adapted to a three-dimensional
world at birth. Although we can mentally focus on something with the
attention mechanism as described above, we can probably do this in other
ways too. For this there is, among other things, in the older parts of
our brains, a kind of two- or three-dimensional neural focusing
mechanism, which helps us see a three-dimensional world [Stephen
M. Kosslyn: 1994].



==Connecting the mirroring network to our muscles.

Motor movement is so very much integrated with perception that imagining
motor action on the sensory side of the neural network helps a lot to
coordinate this motor action. The integration of motor and sensory
neural networks also works the other way round. For instance, as a
violin-player, when I want to read music, it helps me a lot to move my
bowing arm as if playing.

The simple (or over-simplified) idea is this. On the perceptive side of a
neural network we connect our senses to the deconstructing neural
network, while the constructing neural network is not connected to
anything, but it is needed for mirroring. On the motor side it is
exactly the other way round. Here we connect, so to say, our muscles to
the constructing neural network, while the deconstructing neural network
is (probably) not connected to anything.

To understand how our cerebrum is connected to our arms and legs it is
important to understand that the basic tasks of movement are accomplished
by phylogenetically older, ‘lower’ level, ‘reptile’ neural structures.

This control starts with simple reflexes in the spinal cord.

In the brain stem we next find the medulla oblongata, the pons varolii,
and the mid-brain or mesencephalon. In this area is also located the
reticular formation, which has an important influence on consciousness
and arousal. Furthermore, most sensory information goes through these
structures too.

The medulla consists of many nuclei. They take care of simple antagonist
movements, such as moving one joint of an arm up and down. Other nuclei
regulate things like our heartbeat, breathing, the dilatation of our
blood vessels, sneezing, hiccuping, coughing, vomiting, etcetera.

One step higher the pons varolii regulates swallowing, chewing, some
eyeball-movements, taste, facial expression, salivation, and more.

Above this the mid-brain regulates the movement of our eyes and head based
on visual or auditory information.

Many nuclei in the brain stem which regulate motor actions receive
sensory information. At the lowest level this is mainly information which
relates directly to the muscles which are regulated, such as muscle
tension. In the mid-brain information from our eyes and ears is taken into
account too.

In the middle of our brains we find the thalamus. The thalamus is the
principal relay station of our brains. Certainly for sensory impulses it
also does a lot of interpretation. All these sensory impulses reach the
thalamus through all lower brain structures mentioned earlier.
Interpretation and ‘conscious’ recognition of pain and temperature is
thought to take place in the thalamus. It probably also contains a kind of
focusing mechanism for our vision.

Again higher we find the basal ganglia, also named the cerebral nuclei.
They consist mainly of the corpus striatum. The striatum itself is divided
into the lentiform nucleus and the caudate nucleus. The lentiform nucleus
is again subdivided into the corpus putamen, and the globus pallidus.
Lateral to the putamen we find the claustrum. All these neural formations
are heavily interconnected with each other, with the thalamus, and with
the cerebral cortex. The basal ganglia control large unconscious
movements, like swinging arms while walking, and also walking itself.
In many animals one might not immediately notice it if they did not
have a cerebral cortex! In birds the cerebral cortex is hardly
developed. Voluntary motor functions in birds stem from their well
developed basal ganglia. Next to this the basal ganglia (probably the
pallidus) also regulate muscle tone: if one plans to do something, then
the right muscles need to be stimulated a little before actual use.

At the tails of the caudate nuclei we find the amygdaloid nuclei. The
hippocampus, amygdaloid nuclei, and parts of the thalamus and
hypothalamus together form the limbic system. The limbic system regulates
emotion, survival, and our memory. It is responsible for all kinds of
involuntary movement, and complex emotional behavior such as rage. The
amygdaloid nuclei play an important role in aggression and sexuality. My
personal guess is that the hippocampus plays an important role with
regard to the attention mechanism.

The cerebellum is connected to each of the three parts of the brain stem
as mentioned above. It is also rather directly coupled to our cerebrum.
The cerebellum takes care of what is sometimes named ‘subconscious’
movements of skeletal muscles. Important are coordination, maintenance of
posture, and balance. Important herein is proprioception. That is our
sense of position of our body parts relative to each other. From this the
cerebellum makes decisions on muscle contractions with regard to desired
movement. The result is smooth, coordinated movement. With regard to
balancing our body there are direct connections from the balancing-organ
in our inner ear to our cerebellum.

From the above we may conclude that the neural network of the cerebrum,
to which I try to confine myself, has, in daily practice, little to do
with how our muscles do something with our body. We even see that the
phylogenetically older parts of our brain can (not surprisingly, since
reptiles have no more) more or less let us live on their own, and that
the cerebrum is just one more, regulating layer. As regulating layer,
however, it can instruct, and with regard to the cerebellum also teach,
lower level neural structures, i.e. we can train ourselves. Teaching is
probably very limited. The cerebellum takes care of precise adjustment of
movements. I don’t think our basal ganglia have learning abilities. Maybe
it is different in birds? In essence, all that these lower brain
structures do is built-in from birth. There may be crude forms of
adaptation, but nothing which I have to take into account here. What is
important to me is that this apparatus can already do so very much
automatically.

--
Regards,
Bert Frederiks

<<attachment: img58.gif>>

<<attachment: img65.gif>>

<<attachment: BertF.vcf>>
