Steve Richfield wrote:
Ben, et al,
After ~5 months of delay for theoretical work, here are the basic ideas
as to how really fast and efficient automatic learning could be made
almost trivial. I decided NOT to post the paper (yet), but rather to
just discuss some of the underlying ideas in AGI-friendly terms.
Suppose for a moment that a NN or AGI program (they can be easily mapped
from one form to the other

[... this is not obvious, to say the least. Mapping involves many
compromises that change the functioning of each type ...]

), instead of operating on "objects" (in an object-oriented sense)

[Neither NN nor AGI has any intrinsic relationship to OO.]

, operates on the rates of change of the probabilities of "objects", or
dp/dt. Presuming sufficient bandwidth to generally avoid superstitious
coincidences, fast unsupervised learning then becomes completely
trivial, as like objects cause simultaneous like-patterned changes in
the inputs WITHOUT the overlapping effects of the many other objects
typically present in the input (with numerous minor exceptions).
You have already presumed that something supplies the system with
"objects" that are meaningful. Even before your first mention of dp/dt,
there has to be a mechanism that is so good that it never invents
objects such as:
Object A: "A person who once watched all of Tuesday Weld's movies in the
space of one week" or
Object B: "Something that is a combination of Julius Caesar's pinky toe
and a sour grape that Brutus just spat out" or
Object C: "All of the molecules involved in a swimming gala that happen
to be 17.36 meters from the last drop of water that splashed from the pool".
You have supplied no mechanism that is able to do that, but that
mechanism is 90% of the trouble, if learning is what you are about.
Instead, you waved your hands and said "fast unsupervised learning
then becomes completely trivial" .... this statement is a declaration
that a good mechanism is available.
You then also talk about "like" objects. But the whole concept of
"like" is extraordinarily troublesome. Are Julius Caesar and Brutus
"like" each other? Seen from our distance, maybe yes, but from the
point of view of Julius C., probably not so much. Is a G-type star
"like" a mirror? I don't know any stellar astrophysicists who would say
so, but then again OF COURSE they are, because they can be almost
indistinguishable: if you hold a mirror up in the right way it can
reflect the sun, and the two visual images can be identical.
These questions can be resolved, sure enough, but it is the whole
business of resolving these questions (rather than waving a hand over
them and declaring them to be trivial) that is the point.
But, what would Bayesian equations or NN neuron functionality look like
in dp/dt space? NO DIFFERENCE (math upon request). You could trivially
differentiate the inputs to a vast and complex existing AGI or NN,
integrate the outputs, and it would perform _identically_ (except for
some "little" details discussed below). Of course, while the transforms
would be identical, unsupervised learning would be quite a different
matter, as now the nearly-impossible becomes trivially simple.
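A minimal numerical sketch of the round trip (my own illustration, not
from Steve's paper, with arbitrary weights and inputs): for a linear
transform y = Wx, differentiating the input stream, applying the same
weights, and cumulatively summing the output reproduces the direct
result, with the constant of integration taken as zero here. A nonlinear
stage would not commute this cleanly, which is presumably among the
"little" details.

```python
import numpy as np

# Hypothetical linear "network layer": y = W @ x. Feeding the layer first
# differences of the input and integrating (cumulatively summing) its
# output reproduces the direct output.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 6))         # arbitrary weights (assumption)
x = rng.standard_normal((100, 6))       # input time series, 100 steps

y_direct = x @ W.T                      # ordinary ("object-value") operation

# Differentiate inputs: prepending zeros makes dx[0] = x[0], i.e. a zero
# initial condition, so the constant of integration is known.
dx = np.diff(x, axis=0, prepend=np.zeros((1, 6)))
dy = dx @ W.T                           # identical transform, in dp/dt space
y_recovered = np.cumsum(dy, axis=0)     # integrate outputs

print(np.allclose(y_direct, y_recovered))  # True for a linear transform
```

For a nonlinear neuron function f, f(W @ dx) followed by a cumulative sum
is not f(W @ x), so the equivalence as sketched holds only for the linear
part of the transform.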
For some things (like short-term memory) you NEED an integrated
object-oriented result. Very simple - just integrate the signal. How
about muscle movements? Note that muscle actuation typically causes
acceleration, which doubly integrates the driving signal; the already
differentiated signal must therefore be differentiated once more so
that, when doubly integrated by the mechanical system, it produces
movement to the desired location.
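The muscle case can be sketched the same way, in a toy discrete-time
form (the trajectory and helper names are my own assumptions): a desired
position signal is differentiated twice to form the drive, and the
"mechanics" recover the position by integrating twice.

```python
import numpy as np

def differentiate(s):
    # First difference; prepending 0 assumes a zero initial condition
    return np.diff(s, prepend=0.0)

def integrate(s):
    # Discrete integration: running sum
    return np.cumsum(s)

t = np.linspace(0.0, 1.0, 200)
position = np.sin(2 * np.pi * t)    # desired limb position (toy trajectory)

drive = differentiate(differentiate(position))  # doubly differentiated drive
reached = integrate(integrate(drive))           # mechanics doubly integrate it

print(np.allclose(reached, position))  # True
```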
Note that once input values are stored in a matrix for processing, the
baby has already been thrown out with the bathwater. You must START with
differentiated input values and NOT static measured values. THIS is what
the PCA folks have been missing in their century-long quest for an
efficient algorithm to identify principal components, as their arrays
had already discarded exactly what they needed. Of course you could
simply subtract successive samples from one another - at some
considerable risk, since you are now sampling at only half the
Nyquist-required speed to make your AGI/NN run at its intended speed. In
short, if inputs are not being electronically differentiated, then
sampling must proceed at least twice as fast as the NN/AGI cycles.
But - how about the countless lost constants of integration? They "all
come out in the wash" - except for where actual integration at the
outputs is needed. Then, clippers and leaky integrators, techniques
common to electrical engineering, will work fine and produce many of the
same artifacts (like visual extinction) seen in natural systems.
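A leaky integrator with a clipper takes only a few lines to sketch (the
decay and clip parameters are arbitrary assumptions): fed the derivative
of a step stimulus, its output jumps at the onset and then decays away
even though the stimulus remains present, an extinction-like artifact of
the kind mentioned.

```python
import numpy as np

# Leaky integrator: y[t] = decay * y[t-1] + x[t], with a clipper on the state.
def leaky_integrate(x, decay=0.95, clip=1.0):
    y = np.zeros_like(x)
    acc = 0.0
    for i, v in enumerate(x):
        acc = decay * acc + v
        acc = max(-clip, min(clip, acc))  # clipper bounds the state
        y[i] = acc
    return y

stimulus = np.concatenate([np.zeros(50), np.ones(150)])  # step: object appears
dstim = np.diff(stimulus, prepend=0.0)                   # differentiated input
response = leaky_integrate(dstim)

print(response[55] > 0.5)   # True: strong response just after onset
print(response[-1] < 0.1)   # True: response has faded despite the stimulus
```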
It all sounds SO simple, but I couldn't find any prior work in this
direction using Google. However, the collective memory of this group is
pretty good, so perhaps someone here knows of some prior effort that did
something like this. I would sure like to put SOMETHING in the
"References" section of my paper.
Loosemore: THIS is what I was talking about when I explained that there
is absolutely NO WAY to understand a complex system through direct
observation, except by its useless anomalies. By shifting an entire AGI
or NN to operate on derivatives instead of object values, it works
*almost* (the operative word in this statement) exactly the same as one
working in object-oriented space, only learning is transformed from the
nearly-impossible to the trivially simple. Do YOU see any
observation-based way to tell how we are operating behind our eyeballs,
object-oriented or dp/dt? While there are certainly other explanations
for visual extinction, this is the only one that I know of that is
absolutely impossible to engineer around. No one has (yet) proposed any
value to visual extinction, and it is a real problem for hunters, so if
it were avoidable, then I suspect that ~200 million years of evolution
would have eliminated it long ago.
Read David Marr's book "Vision", or any other text that discusses the
low level work done by the visual system. There are indeed
differentiation functions in there (IIRC, Marr came up with the
Difference of Gaussians (DoG) idea because it was a way to do the
equivalent of dp/dt). BUT... this is all in the first few wires coming
out of the retina! It is not interesting.
Visual extinction (of the sort you are talking about) is all over and
done with in the first few cells of the visual pathway, whereas you are
talking here about the millions of other processes that occur higher up.
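For what it's worth, the DoG operator is easy to sketch in one dimension
(the kernel widths here are arbitrary assumptions): subtracting a broad
Gaussian blur from a narrow one gives a response that is near zero over
uniform regions and peaks at edges, i.e. a band-limited derivative-like
operator of the sort found early in the visual pathway.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    # Discrete, normalized 1-D Gaussian
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

def dog_response(signal, sigma_center=1.0, sigma_surround=2.0, radius=8):
    # Difference of Gaussians: narrow (center) minus broad (surround) blur
    center = np.convolve(signal, gaussian_kernel(sigma_center, radius), mode="same")
    surround = np.convolve(signal, gaussian_kernel(sigma_surround, radius), mode="same")
    return center - surround

edge = np.concatenate([np.zeros(50), np.ones(50)])  # 1-D luminance edge
resp = dog_response(edge)

print(abs(resp[10]) < 1e-6)   # True: silent over the uniform region
print(abs(resp[50]) > 0.05)   # True: responds at the edge
```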
As for your comment about complex systems, it looks like a non sequitur.
It just does not follow, as far as I can see.
Richard Loosemore
From this come numerous interesting corollaries.
Once the dp/dt signals are in array form, it would become simple to
automatically recognize patterns representing complex phenomena at the
level of the neurons/equations in question. Of course, putting it in
this array form is effectively a transformation from AGI equations to NN
construction, a transformation that has been discussed in prior
postings. In short, if you want your AGI to learn at anything
approaching biological speeds, it appears that you absolutely MUST
transform your AGI structure to a NN-like representation, regardless of
the structure of the processor on which it runs.
Unless I am missing something really important here, this should
COMPLETELY transform the AGI field, regardless of the particular
approach taken.
Any thoughts?
Steve Richfield
------------------------------------------------------------------------
agi | Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: https://www.listbox.com/member/?&
Powered by Listbox: http://www.listbox.com