HOW VALUABLE IS SOLOMONOFF INDUCTION FOR REAL-WORLD AGI?

Through some AGIRI links I stumbled on Shane Legg's "Friendly AI is
bunk" page.  I was impressed.  I read more of his related web pages and
became very impressed.

Then I read his paper "Solomonoff Induction"
(http://www.vetta.org/documents/disSol.pdf).

It left me confused.  I don't have a sufficient math background to
understand all its notation, so I know I may well be wrong (I mean
totally wrong), but to me Solomonoff Induction, as described in the
paper, seemed to be missing something big, at least if it were to be
used for a general-purpose AGI that is to learn from perception in the
real world.

I understand the basic idea: if you are seeking a prior likelihood for
the occurrence of an event and you have no data about the frequency of
its occurrence -- absent any other knowledge -- some notion of the
complexity of the event, in information-theory terms, might help you make
a better guess.  This makes sense because reality is a big computer, and
complexity -- in terms of the number of combined events required to make
reality cough up a given event, and the complexity of the space in which
those events are to be combined -- should to some extent be related to
the event's probability.  I can understand how such complexity could be
approximated by the length of the code required on a theoretical
universal computer to model such real-world event-occurrence complexity.
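
To make this concrete, here is a toy sketch of a Solomonoff-style prior
(my own illustration, not anything from Legg's paper).  Since true
Kolmogorov complexity is uncomputable, it uses zlib compressed length as
a crude, admittedly imperfect stand-in:

    import random
    import zlib

    def complexity_proxy(data: bytes) -> int:
        # Crude, computable stand-in for Kolmogorov complexity: the
        # length of the data after generic compression.  A true
        # Solomonoff prior would use the length of the shortest
        # program for the data on a universal Turing machine.
        return len(zlib.compress(data))

    def solomonoff_style_prior(hypotheses):
        # Weight each hypothesis by 2^-K(h) and normalize, so simpler
        # (more compressible) hypotheses receive higher prior mass.
        weights = {h: 2.0 ** -complexity_proxy(h) for h in hypotheses}
        total = sum(weights.values())
        return {h: w / total for h, w in weights.items()}

    random.seed(0)
    regular = b"01" * 50                                          # patterned
    irregular = bytes(random.randrange(256) for _ in range(100))  # noise

    for h, p in solomonoff_style_prior([regular, irregular]).items():
        print(complexity_proxy(h), p)   # the regular sequence dominates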

But it seems to me there are factors other than the complexity of
representing, or of computing a match against, a hypothesis that -- in an
AGI sensing and acting in the real world -- might be much more important
and functional for estimating probabilities.

For example, although humans are not good at accurately estimating
certain types of probabilities, we have an innate ability to guess
probabilities in a context-sensitive way and by using knowledge about
similar yet different things.  (As shown in the example quoted below from
the Kemp paper.)  It would seem to me that the complexity of the
computation required to work out which context is appropriate is not,
itself, necessarily related to the complexity of an event occurring in
that context once the context has been determined.

Similarly, it does not seem to me that the complexity of determining what
is similar to what, and in which ways -- for the purpose of deciding how
far the probabilities of something similar can serve as an appropriate
prior for something never before seen -- is necessarily related to the
probability of the never-before-seen thing itself occurring.  The
complexity of perception is not directly related to the complexity of
occurrence: the complexity of perception can be greatly affected by
changes in light, shape, and viewpoint which may have nothing to do with
the probability or complexity of occurrence.

So what I am saying is, for example, that if you are receiving a sequence
of bytes from a video camera, much of the complexity in the input stream
may have nothing to do with complexity-of-event-creation or
Occam's-razor-type issues, and much to do with the complexity of
perception, of similarity understanding, or of appropriate context
selection -- factors which are not themselves necessarily related to the
complexity of occurrence.
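
As a toy illustration of that point (again my own sketch, with compressed
length standing in for description length): the very same event in the
world can produce raw input streams of wildly different complexity,
depending only on the sensing conditions:

    import random
    import zlib

    def stream_complexity(frame: bytes) -> int:
        # Compressed length as a rough measure of the description
        # length of the raw input stream.
        return len(zlib.compress(frame))

    # One and the same event: a uniform grey wall, 100x100 "pixels".
    well_lit = bytes([128]) * 10_000

    # The same wall in low light, where sensor noise perturbs every
    # pixel.  The event and its probability are unchanged; only the
    # perceptual channel differs.
    random.seed(42)
    low_light = bytes((128 + random.randint(-20, 20)) % 256
                      for _ in range(10_000))

    print("well-lit frame: ", stream_complexity(well_lit))    # tens of bytes
    print("low-light frame:", stream_complexity(low_light))   # thousands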

Furthermore, I am saying that for an AGI it seems to me it would make much
more sense to attempt to derive priors from notions of similarity, from
the probabilities of similar things, events, and contexts, and from
things like causal models for similar or generalized classes.  There is
usually much about reality that we do know, which we can, and do, use
when learning about things we don't know.
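
A minimal sketch of the shape of what I mean -- with entirely made-up
features and frequencies, purely for illustration -- is a prior for a
never-before-seen thing backed off from the observed frequencies of
similar things:

    def similarity_weighted_prior(new_features, known_events):
        # Prior for a never-before-seen event: the observed
        # frequencies of known events, averaged with weights given by
        # feature overlap (Jaccard similarity) rather than by the
        # description length of the new event itself.
        def jaccard(a, b):
            return len(a & b) / len(a | b) if a | b else 0.0
        num = den = 0.0
        for features, freq in known_events:
            w = jaccard(new_features, features)
            num += w * freq
            den += w
        return num / den if den else 0.5   # fall back to ignorance

    # Made-up example: how likely is a never-seen animal to be
    # dangerous, judged from animals that share features with it?
    known = [({"large", "carnivore", "wild"}, 0.80),
             ({"small", "herbivore", "domestic"}, 0.05),
             ({"large", "herbivore", "wild"}, 0.30)]
    print(similarity_weighted_prior({"large", "carnivore", "domestic"}, known))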

A very good paper on this subject is "Learning overhypotheses with
hierarchical Bayesian models" by Charles Kemp et al., at
http://www.mit.edu/~perfors/KempPTDevSci.pdf .  It gives a very good
example of the power of this type of reasoning -- power that it appears
to me Solomonoff Induction totally lacks:

                "participants were asked to imagine that they were
exploring an island in the Southeastern Pacific, that they had encountered
a single member of the Barratos tribe, and that this tribesman was brown
and obese. Based on this single example, participants concluded that most
Barratos were brown, but gave a much lower estimate of the proportion of
obese Barratos (Figure 4). When asked to justify their responses,
participants often said that tribespeople were homogeneous with respect to
color but heterogeneous with respect to body weight (Nisbett et al.,
1983)."

Perhaps I am totally missing something -- which is very possible, and if
so, I would like to have it pointed out to me -- but I think the type of
overhypothesis reasoning described in the Kemp paper is a much more
powerful and useful source of priors for Bayesian reasoning in AGIs that
are expected to learn in the "real world" than Solomonoff Induction.

Since I expect Shane Legg is much more knowledgeable than I am on this
subject, I am expecting to be told that I am really off track, either by
him or by someone else on this list.

If so, please inform me how.

Ed Porter
24 String Bridge S12
Exeter, NH 03833
(617) 494-1722
Fax (617) 494-1822
[EMAIL PROTECTED]
