Hello Edward,

I'm glad you found some of the writing and links interesting.  Let me try to
answer some of your questions.

> I understand the basic idea that if you are seeking a prior likelihood for
> the occurrence of an event and you have no data about the frequency of its
> occurrence -- absent any other knowledge -- some notion of the complexity,
> in information-theory terms, of the event might help you make a better
> guess.  This makes sense because reality is a big computer, and complexity
> -- in terms of the number of combined events required to make reality cough
> up a given event, and the complexity of the space in which those events are
> to be combined -- should to some extent be related to the event's
> probability.  I can understand how such complexity could be approximated by
> the length of code required in some theoretical Universal computer to
> model such real-world event-occurrence complexity.
>
This seems like a reasonable summary to me.
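
In fact, your "length of code" intuition is exactly what Solomonoff's universal prior formalizes.  Roughly speaking, the prior probability of a binary string x is the combined weight of all programs that make a universal machine U output something beginning with x:

    M(x) = \sum_{p : U(p) = x*} 2^{-|p|}

where |p| is the length of program p in bits and x* means the output starts with x.  Short programs, i.e. simple explanations, dominate the sum.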

OK, let's take your example, as I think it captures the essence of what you are getting at:

> So what I am saying is, for example, that if you are receiving a sequence
> of bytes from a video camera, much of the complexity in the input stream
> might not be related to complexity-of-event-creation- or Occam's-razor-type
> issues, but rather to complexity of perception, or similarity understanding,
> or of appropriate context selection, factors which are not themselves
> necessarily related to complexity of occurrence.
>
In short, yes.  So, for example, if you hooked up a Solomonoff induction machine to a video camera, it would first need to, in some sense, "understand" the nature of this input stream.  This may be more complex than what the camera is actually looking at!

Given that the Solomonoff machine starts from zero knowledge of the world, other than a special form of prior knowledge provided by the universal prior, there is no way around this problem.  Somehow it has to learn this stuff.  The good news, as Solomonoff proved, is that if the encoding of the video input stream isn't too crazy complex (i.e. there exists a good algorithm to process the stream that isn't too long), then the Solomonoff machine will very quickly work out how to understand the video stream.  Furthermore, if we put such a camera on a robot or something wandering around in the world, then it would not take long at all before the complexity of the observed world far surpassed the complexity of the video stream encoding.
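
To put a number on "very quickly": one standard form of Solomonoff's result says that if the true computable distribution generating the input is \mu, then the universal predictor M's total expected squared prediction error, summed over all time, is finite:

    \sum_t E ( M(0 | x_{<t}) - \mu(0 | x_{<t}) )^2  <=  (ln 2 / 2) K(\mu)

where K(\mu) is the length of the shortest program computing \mu.  In other words, the machine only ever "pays" a one-off cost proportional to the complexity of the input encoding, after which its predictions converge on those of the true model.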

Perhaps what you should consider is a Solomonoff machine that has been pre-trained to do, say, vision.  That is, you get the machine and train it up on some simple vision input so that it understands the nature of this input.  Only then do you look at how well it performs at finding structure in the world through its visual input stream.

> Furthermore, I am saying that for an AGI it seems to me it would make much
> more sense to attempt to derive priors from notions of similarity, of
> probabilities of similar things, events, and contexts, and from things like
> causal models for similar or generalized classes.  There is usually much
> from reality that we do know that we can, and do, use when learning about
> things we don't know.
>
Yes.  In essence what you seem to be saying is that our prior knowledge of the world strongly biases how we interpret new information.  So, to use your example, we all know that people living on a small isolated island are probably genetically quite similar to each other.  Thus, if we see that one has brown skin, we will guess that the others probably do also.  However, weight is not so closely tied to genetics, and so if one islander is obese this does not tell us much about how much the other islanders weigh.

Out-of-the-box a Solomonoff machine doesn't know anything about genetics and weight, so it can't make such inferences based on seeing just one islander.  However, if it did have prior experience with genetics etc., then it too would generalise as you describe, using context.  Perhaps the best place to understand the theory of this is section 8.3 of "An Introduction to Kolmogorov Complexity and Its Applications" by Li and Vitanyi.  You can also find some approximations to this theory that have been applied in practice to many diverse problems under the name "Normalized Compression Distance", or NCD.  A lot of this work has been done by Rudi Cilibrasi.
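
If you want to see the flavour of NCD concretely, here is a minimal Python sketch.  Note that zlib is just a stand-in for the ideal (uncomputable) Kolmogorov compressor, and the test strings are a toy example of my own:

    import zlib

    def C(data: bytes) -> int:
        # Approximate the Kolmogorov complexity K(data) by the length
        # of the string after compression with a real compressor.
        return len(zlib.compress(data, 9))

    def ncd(x: bytes, y: bytes) -> float:
        # NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
        cx, cy, cxy = C(x), C(y), C(x + y)
        return (cxy - min(cx, cy)) / max(cx, cy)

    a = b"the quick brown fox jumps over the lazy dog" * 20
    b = b"the quick brown fox jumps over the lazy dog" * 19
    c = bytes(range(256)) * 4

    print(ncd(a, b))  # small: a and b share almost all their structure
    print(ncd(a, c))  # near 1: a and c have little in common

The closer the NCD of two objects is to 0, the more structure the compressor finds in common between them.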

> HOW VALUABLE IS SOLOMONOFF INDUCTION FOR REAL-WORLD AGI?
>
Well ;-)

In a certain literal sense, not much, as it is not computable.  However, many practical methods in machine learning and statistics can be viewed as computable approximations of Solomonoff induction, and things like NCD have been used in practice with some success.  And who knows, perhaps some very smart person will come up with a new version of Solomonoff induction that is much more practically useful.  Personally, I suspect other approaches will reach human-level AGI first.

If you are interested in this topic, I'm currently finishing off the last bits of my PhD thesis, which has a chapter that explains Solomonoff induction and AIXI in the most complete and simple way I could manage.  It should be available before too long, once I've done some final polishing of the text.

Cheers,
Shane
