Steve Richfield wrote:
Ben, et al,
After ~5 months of delay for theoretical work, here are the basic ideas as to how really fast and efficient automatic learning could be made almost trivial. I decided NOT to post the paper (yet), but rather to just discuss some of the underlying ideas in AGI-friendly terms. Suppose for a moment that a NN or AGI program (they can be easily mapped from one form to the other

... this is not obvious, to say the least. Mapping involves many compromises that change the functioning of each type ...

), instead of operating on "objects" (in an
object-oriented sense)

Neither NN nor AGI has any intrinsic relationship to OO.

, instead operates on the rates of change in the probabilities of "objects", or dp/dt. Presuming sufficient bandwidth to generally avoid superstitious coincidences, fast unsupervised learning then becomes completely trivial, as like objects cause simultaneous, like-patterned changes in the inputs WITHOUT the overlapping effects of the many other objects typically present in the input (with numerous minor exceptions).
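To make the "like-patterned changes" claim concrete, here is a minimal numpy sketch; the signals, frequencies, and amplitudes are all invented for illustration. Two sensors share one fast "object" signal but each carries a large, unrelated slow background; the shared object is nearly invisible in the raw correlation yet dominates the correlation of the derivatives:

```python
import numpy as np

dt = 0.01
t = np.arange(1000) * dt               # 10 s of "sensor" time

# Shared "object" signal seen by both sensors
obj = np.sin(40.0 * t)

# Each sensor also carries a large, unrelated slow background
a = obj + 5.0 * np.sin(2 * np.pi * t / 10.0)
b = obj + 5.0 * np.cos(4 * np.pi * t / 10.0)

raw_corr = np.corrcoef(a, b)[0, 1]               # object swamped by backgrounds
dpdt_corr = np.corrcoef(np.diff(a) / dt,
                        np.diff(b) / dt)[0, 1]   # object dominates the derivatives

print(raw_corr, dpdt_corr)
```

The slow backgrounds barely move from sample to sample, so differentiation suppresses them while the fast shared component survives; that is the whole trick in miniature.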

You have already presumed that something supplies the system with "objects" that are meaningful. Even before your first mention of dp/dt, there has to be a mechanism that is so good that it never invents objects such as:

Object A: "A person who once watched all of Tuesday Weld's movies in the space of one week" or

Object B: "Something that is a combination of Julius Caesar's pinky toe and a sour grape that Brutus just spat out" or

Object C: "All of the molecules involved in a swimming gala that happen to be 17.36 meters from the last drop of water that splashed from the pool".

You have supplied no mechanism that is able to do that, but that mechanism is 90% of the trouble, if learning is what you are about.

Instead, you waved your hands and said "fast unsupervised learning then becomes completely trivial" .... this statement is a declaration that a good mechanism is available.

You then also talk about "like" objects. But the whole concept of "like" is extraordinarily troublesome. Are Julius Caesar and Brutus "like" each other? Seen from our distance, maybe yes, but from the point of view of Julius C., probably not so much. Is a G-type star "like" a mirror? I don't know any stellar astrophysicists who would say so - but then again, OF COURSE they are, because they are almost indistinguishable: if you hold a mirror up in the right way it can reflect the sun, and the two visual images can be identical.

These questions can be resolved, sure enough, but it is the whole business of resolving these questions (rather than waving a hand over them and declaring them to be trivial) that is the point.

But, what would Bayesian equations or NN neuron functionality look like in dp/dt space? NO DIFFERENCE (math upon request). You could trivially differentiate the inputs to a vast and complex existing AGI or NN, integrate the outputs, and it would perform _identically_ (except for some "little" details discussed below). Of course, while the transforms would be identical, unsupervised learning would be quite a different matter, as the nearly impossible now becomes trivially simple.

For some things (like short-term memory) you NEED an integrated, object-oriented result. Very simple - just integrate the signal. How about muscle movements? Note that muscle actuation typically causes acceleration, which doubly integrates the driving signal. That calls for differentiating the already-differentiated signal once more, so that when the mechanical system doubly integrates it, the result is movement to the desired location.

Note that once input values are stored in a matrix for processing, the baby has already been thrown out with the bathwater. You must START with differentiated input values and NOT static measured values. THIS is what the PCA folks have been missing in their century-long quest for an efficient algorithm to identify principal components: their arrays had already discarded exactly what they needed.

Of course you could simply subtract successive samples from one another - at some considerable risk, since you would then be sampling at only half the Nyquist-required speed for your AGI/NN to run at its intended speed. In short, if inputs are not being electronically differentiated, then sampling must proceed at least twice as fast as the NN/AGI cycles. But what about the countless lost constants of integration? They "all come out in the wash" - except where actual integration at the outputs is needed.
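The differentiate-then-integrate claim is easy to check for the linear part of a network. A minimal numpy sketch (the weight matrix and input signal are invented for illustration, and the "lost constant of integration" is restored explicitly from the first sample):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((3, 5))      # a fixed linear "network"
x = rng.standard_normal((5, 200))    # input signal: 5 channels, 200 time steps

y_direct = W @ x                     # ordinary, object-valued operation

dx = np.diff(x, axis=1)              # differentiate the inputs
y_dpdt = np.cumsum(W @ dx, axis=1)   # operate in dp/dt space, integrate outputs
y_dpdt += y_direct[:, [0]]           # restore the lost constant of integration

print(np.allclose(y_dpdt, y_direct[:, 1:]))  # True
```

This only demonstrates the linear case; how a derivative passes through a nonlinear neuron function is exactly the sort of "little detail" the promised math would have to cover.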
Then, clippers and leaky integrators, techniques common to electrical engineering, will work fine and produce many of the same artifacts (like visual extinction) seen in natural systems.

It all sounds SO simple, but I couldn't find any prior work in this direction using Google. However, the collective memory of this group is pretty good, so perhaps someone here knows of some prior effort that did something like this. I would sure like to put SOMETHING in the "References" section of my paper.

Loosemore: THIS is what I was talking about when I explained that there is absolutely NO WAY to understand a complex system through direct observation, except by its useless anomalies. By shifting an entire AGI or NN to operate on derivatives instead of object values, it works *almost* (the operative word in this statement) exactly the same as one working in object-oriented space, only learning is transformed from the nearly-impossible to the trivially simple. Do YOU see any observation-based way to tell how we are operating behind our eyeballs, object-oriented or dp/dt?

While there are certainly other explanations for visual extinction, this is the only one that I know of that is absolutely impossible to engineer around. No one has (yet) proposed any value to visual extinction, and it is a real problem for hunters, so if it were avoidable, then I suspect that ~200 million years of evolution would have eliminated it long ago.
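As a sketch of the leaky-integrator point (constants chosen arbitrarily): in dp/dt space a perfectly constant input contributes only an initial impulse, so a leaky integrator's response to it decays away - a crude analogue of the extinction artifact described above:

```python
import numpy as np

def leaky_integrate(signal, leak=0.05):
    """First-order leaky integrator: out[t] = (1 - leak) * out[t-1] + signal[t]."""
    out = np.zeros(len(signal))
    acc = 0.0
    for t, s in enumerate(signal):
        acc = (1.0 - leak) * acc + s
        out[t] = acc
    return out

x = np.ones(100)                 # a static, unchanging input
dx = np.diff(x, prepend=0.0)     # in dp/dt space: one impulse, then all zeros
y = leaky_integrate(dx)

print(y[0], y[-1])               # response starts at 1.0 and decays toward 0
```

An unchanging stimulus thus fades from the output on its own, with no extra machinery.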

Read David Marr's book "Vision", or any other text that discusses the low level work done by the visual system. There are indeed differentiation functions in there (IIRC, Marr came up with the Difference of Gaussians (DOG) idea because the difference of Gaussians was a way to do the equivalent of dp/dt). BUT... this is all in the first few wires coming out of the retina! It is not interesting. Visual extinction (of the sort you are talking about) is all over and done with in the first few cells of the visual pathway, whereas you are talking here about the millions of other processes that occur higher up.
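For reference, the Difference of Gaussians operator is a center-surround, band-pass kernel: it responds where its input changes and gives approximately zero over uniform regions, which is why it behaves like a spatial analogue of dp/dt. A minimal numpy sketch (the two sigmas are chosen arbitrarily):

```python
import numpy as np

def gaussian(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

xs = np.arange(-10.0, 11.0)
dog = gaussian(xs, 1.0) - gaussian(xs, 1.6)         # center-surround (DoG) kernel

step = np.concatenate([np.zeros(50), np.ones(50)])  # a simple edge
resp = np.convolve(step, dog, mode="same")

# Response is ~0 over the flat regions and peaks near the edge at index 50
print(resp[20], resp[75], int(np.argmax(np.abs(resp))))
```

The two flat regions produce near-zero output because the kernel's positive center and negative surround nearly cancel; only the change at the edge survives.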

As for your comment about complex systems, it looks like a non sequitur. It just does not follow, as far as I can see.



Richard Loosemore



From this come numerous interesting corollaries.

Once the dp/dt signals are in array form, it would become simple to automatically recognize patterns representing complex phenomena at the level of the neurons/equations in question. Of course, putting it in this array form is effectively a transformation from AGI equations to NN construction, a transformation that has been discussed in prior postings. In short, if you want your AGI to learn at anything approaching biological speeds, it appears that you absolutely MUST transform your AGI structure to a NN-like representation, regardless of the structure of the processor on which it runs.

Unless I am missing something really important here, this should COMPLETELY transform the AGI field, regardless of the particular approach taken. Any thoughts?

Steve Richfield




-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=8660244&id_secret=123753653-47f84b
Powered by Listbox: http://www.listbox.com
