Steve,

This sort of simple solution is what makes me say that relational
learning is where real progress is to be made. That's not to say that
we shouldn't rely on past work in flat learning: a great deal of
progress has been made in that area, pushing flat methods far beyond
what simplistic solutions can do.

Anyway, some comments on your proposal...

The method sounds more like clustering than like principal components.
I suppose it depends on exactly how the lateral inhibition behaves. If
features are allowed to combine linearly, it is PCA, but if lateral
inhibition forces only one neuron to respond to a given input, it is
clustering.
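To make the distinction concrete, here is a toy sketch (all names,
sizes, and learning rates are illustrative, not anything from your
proposal): winner-take-all lateral inhibition gives a competitive,
clustering-style update, while letting a single linear unit respond
freely and adapt by a Hebbian rule such as Oja's drives its weights
toward the first principal component.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))          # toy input "frames"
X[:, 0] *= 3.0                          # give one direction dominant variance

# Winner-take-all lateral inhibition -> online clustering (k-means-like):
# only the single most responsive unit adapts toward each input.
W = rng.normal(size=(4, 8))
for x in X:
    winner = np.argmax(W @ x)           # inhibition silences all but one unit
    W[winner] += 0.05 * (x - W[winner]) # move that unit's weights toward x

# No inhibition, linear combination -> PCA-like behavior (Oja's rule):
# the unit's weights converge toward the first principal component.
w = rng.normal(size=8)
w /= np.linalg.norm(w)
for x in X:
    y = w @ x
    w += 0.01 * y * (x - y * w)         # Hebbian term plus self-normalizing decay

w /= np.linalg.norm(w)                  # w now points near the high-variance axis
```

Same input stream, same neurons; only the inhibition rule decides
whether you get cluster centers or a principal direction.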

It seems unlikely that an entire visual frame will ever be repeated,
even in dp/dt space. So, I infer that when you say "frame" you are
thinking only of the field of inputs of an individual neuron, which
perhaps corresponds to a small region on the retina. Taking the
standard route, the neurons could then be arranged in a hierarchy, so
that more abstract neurons take as input the output of less abstract
ones. But I'm not sure this would go well the way you've described
things. The top level could only recognize whole-scene-classes that
were defined by the intersection of the nonzero elements of all their
members (because each individual neuron will have this property),
which seems very limiting. This could be fixed easily enough, though,
by standard methods.
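To spell out the limitation with a toy sketch (the learner and helper
names here are hypothetical, just to illustrate the point): a detector
that requires a fixed set of nonzero inputs can only key on the
features shared by *every* member of its class, i.e. the intersection
of their nonzero elements.

```python
# Each "neuron" fires only when every input in its required set is nonzero,
# so a class detector built this way keys only on the features shared by
# ALL members of the class -- the intersection of their nonzero elements.
def learn_required_set(examples):
    # hypothetical learner: keep just the positions nonzero in every example
    return set.intersection(
        *[{i for i, v in enumerate(e) if v} for e in examples]
    )

def fires(required, frame):
    return all(frame[i] for i in required)

members = [
    [1, 1, 0, 1, 0],   # two members of one scene class
    [1, 0, 1, 1, 0],
]
req = learn_required_set(members)        # only features 0 and 3 survive
print(fires(req, [1, 0, 0, 1, 0]))       # prints True: the intersection alone suffices
```

Anything outside that shared core is invisible to the detector, which
is why the top level of such a hierarchy seems so limiting.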

Anyway, such a hierarchy will not learn any relational concepts :P.
There are ways of getting it to learn *some* relational concepts (for
example, simply the fact that our eyes are constantly moving will
help tremendously, since moving our eyes to different parts of the
picture is equivalent to one of the suggestions I make in the blog
post I referred you to).

It may be true that all standard PCA methods are "batch mode only",
but there are standard clustering methods that do what you want (one
such method is called "sparse distributed memory").
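For reference, here is a minimal sketch of a Kanerva-style sparse
distributed memory; the dimensions, number of hard locations, and
activation radius are illustrative choices, not canonical ones.

```python
import numpy as np

# Minimal sparse distributed memory (Kanerva) sketch. Writes increment or
# decrement counters at every hard location within Hamming radius R of the
# write address; reads sum counters over the activated locations and threshold.
rng = np.random.default_rng(1)
D, N, R = 64, 500, 26                       # bit width, hard locations, radius

hard = rng.integers(0, 2, size=(N, D))      # fixed random hard addresses
counters = np.zeros((N, D), dtype=int)

def near(addr):
    # boolean mask of hard locations within Hamming distance R of addr
    return np.count_nonzero(hard != addr, axis=1) <= R

def write(addr, data):
    counters[near(addr)] += 2 * data - 1    # +1 for 1-bits, -1 for 0-bits

def read(addr):
    return (counters[near(addr)].sum(axis=0) > 0).astype(int)

pattern = rng.integers(0, 2, size=D)
write(pattern, pattern)                     # autoassociative store
noisy = pattern.copy()
noisy[:5] ^= 1                              # corrupt the cue with 5 bit flips
recalled = read(noisy)                      # activated sets overlap, so the
                                            # stored pattern tends to come back
```

The point relevant here is that writes and reads are fully online, one
pattern at a time, with no batch pass over the data.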

--Abram

On Sun, Dec 28, 2008 at 5:45 AM, Steve Richfield
<[email protected]> wrote:
> Loosemore, et al,
>
> Just to get this discussion out of esoteric math, here is a REALLY SIMPLE
> way of doing unsupervised learning with dp/dt that looks like it ought to
> work.
>
> Suppose we record each occurrence of the inputs to a neuron, keeping
> counters to identify how many times each combination has happened. For this
> discussion, each input will be considered to have either a substantial
> positive, substantial negative, or nearly zero dp/dt. When we reach a
> threshold of, say, 20 identical occurrences of the same combination of
> dp/dt that is NOT accompanied by lateral inhibition, we will proclaim THAT
> to be our "principal component" function for that neuron to do for the rest
> of its "life". Thereafter, the neuron will require the previously observed
> positive and negative inputs to be as programmed, but will ignore all inputs
> that were nearly zero.
>
> Of course, many frames will be "corrupted" because of overlapping phenomena,
> sampling on dp/dt edges, noise, fast phenomena, etc., etc. However, there
> will be few if any precise repetitions of corrupted frames, whereas clean
> frames should be quite common.
>
> First the most common "frame" (all zeros - nothing there) will be
> recognized, followed by each of the most common simultaneously occurring
> temporal patterns recognized by successive neurons, all identified in order
> of decreasing frequency exactly as needed for Huffman or PCA coding.
>
> This process won't start until all inputs are accompanied by an indication
> that they have already been programmed by this process, so that programming
> will proceed layer by layer without corruption from inputs being only
> partially developed (a common problem in multi-layer NNs).
>
> While clever math might make this work a little faster, and certainly wet
> neurons can't store many previous patterns, this should be guaranteed to
> work, and produce substantially perfect unsupervised learning, albeit
> probably slower than better-math methods, but probably faster than wet
> neurons that can't save thousands of combinations during early programming.
>
> Of course, this would be completely unworkable outside of dp/dt space, as in
> "object space", this would probably exhaust a computer's memory before
> completing.
>
> Does this get the Loosemore Certificate of No Objection as being an
> apparently workable method for substantially optimal unsupervised learning?
>
> Thanks for considering this.
>
> Steve Richfield
>
>
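A minimal sketch of the counting scheme quoted above, for the record:
dp/dt per input is quantized to -1/0/+1, whole frames are tallied, and
the neuron freezes on the first pattern seen THRESHOLD times without
lateral inhibition. THRESHOLD = 20 is the value Steve suggests; the
`quantize` helper and its epsilon are my assumptions.

```python
from collections import Counter

THRESHOLD = 20   # Steve's suggested repetition count

def quantize(dpdt, eps=0.1):
    # substantial positive -> +1, substantial negative -> -1, near zero -> 0
    return tuple(0 if abs(v) < eps else (1 if v > 0 else -1) for v in dpdt)

class Neuron:
    def __init__(self):
        self.counts = Counter()
        self.program = None          # frozen dp/dt signature, once learned

    def observe(self, dpdt, inhibited=False):
        # count uninhibited frames; commit on the first to reach THRESHOLD
        if self.program is not None or inhibited:
            return
        frame = quantize(dpdt)
        self.counts[frame] += 1
        if self.counts[frame] >= THRESHOLD:
            self.program = frame     # fixed for the rest of the neuron's "life"

    def responds(self, dpdt):
        # require the programmed +/- inputs; ignore inputs programmed near zero
        if self.program is None:
            return False
        frame = quantize(dpdt)
        return all(f == p for f, p in zip(frame, self.program) if p != 0)

n = Neuron()
for _ in range(THRESHOLD):
    n.observe([0.5, -0.5, 0.0])      # 20 clean repetitions -> neuron programs
```

As written this makes the "ignore near-zero inputs after programming"
behavior explicit, which is the part that lets corrupted frames fail to
accumulate while clean ones repeat.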



-- 
Abram Demski
Public address: [email protected]
Public archive: http://groups.google.com/group/abram-demski
Private address: [email protected]


-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=8660244&id_secret=123753653-47f84b
Powered by Listbox: http://www.listbox.com
