Juergen Schmidhuber wrote:

>We need a prior probability distribution on possible histories.

OK, I agree with that. But of course we differ on the meaning of
"possible histories", and we also tackle the "prior probability"
in quite different ways.

>Then, once we have observed a past history, we can use Bayes' rule to
>compute the probability of various futures, given the observations. Then
>we predict the most probable future.

It seems to me that Bayes' rule is what we *need* first, to build a
plausible *past*; concerning the future, and even the present, I
prefer a theory, or a philosophy (or religion), hoping it can shed
light on the unknown.

>The algorithmic TOE paper discusses several formally describable priors
>computable in the limit. But for the moment let me focus on my favorite,
>the Speed Prior.

Mmh... You know that there exists a class of infinitely speedable
programs (and theories), and even that this class corresponds to a
superclass of the universal machines. This entails that the UD,
the Universal Dovetailer, which I sometimes like to call
the 'splashed Universal Turing Machine'
(which, by the way, makes UD* a sort of denotational semantics of the
UTM), could generate and execute quicker and quicker versions of
itself. This means you need a non-universal ontology.
To help others understand the point, I propose we postpone discussing
it until after I give the solution of the problem in
"diagonalisation 1".
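For concreteness, the dovetailing idea itself is easy to sketch. Here is a minimal toy in Python (an illustration only, not the UD proper): Python generators stand in for programs, and the dovetailer starts one more program per phase while advancing every program already started by one step.

```python
from itertools import count

def dovetail(programs, phases):
    """Interleave countably many programs: in phase k, start program k,
    then advance every program started so far by one more step."""
    running = []   # (index, iterator) pairs started so far
    trace = []     # (program_index, output) in order of execution
    for k in range(phases):
        running.append((k, programs(k)))
        for i, it in running:
            try:
                trace.append((i, next(it)))
            except StopIteration:
                pass   # program i halted; the others keep running
    return trace

# Toy program family: program i just counts forever (never halts).
def prog(i):
    return (f"p{i}:step{t}" for t in count())

trace = dovetail(prog, 4)
# 4 phases over non-halting programs give 1+2+3+4 = 10 interleaved steps.
assert len(trace) == 10
assert trace[0] == (0, "p0:step0")
assert (3, "p3:step0") in trace
```

No program, halting or not, can starve the others: each one started gets one more step in every later phase.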

>Assume the Great Programmer has a resource problem, just like any 
>other programmer.

You told Wei Dai not to take the Great Programmer too seriously.
I am not sure I can imagine the Great Programmer having a resource
problem, unless you are a finitist! Note that the UTM emulating
the UD will use an unbounded part of its tape ...
How could the GP have a resource problem when he is the one
building the universes in which resource problems can (apparently)
happen? Are you not presupposing some physical structure in advance
somewhere?

>According to the Speed Prior, the harder it is to compute
>something, the less likely it gets. More precisely, the cumulative prior
>probability of all data whose computation costs at least O(n) time is
>at most inversely proportional to n.

Very interesting idea. I would say: the harder it is to compute
something FROM here and now, the less likely it gets from here and
now. And I would add that this is only true if your neighborhood is
less complex than yourself. I really believe your idea is interesting
for some low-level computational physics. But when agents, including
yourself, enter the play, it is another matter, because it is
intrinsically harder to compute things then.
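To make the quoted 1/n tail bound concrete, here is a toy numeric check (my own simplified model, not Schmidhuber's actual construction): give prior mass 2^-k to a computation that costs 2^k steps; the cumulative mass of everything costing at least n steps is then bounded by 2/n.

```python
def tail_mass(n, kmax=60):
    """Total prior mass of the toy computations whose cost 2**k is >= n.

    Toy model (hypothetical): index k carries prior mass 2**-k and stands
    for a computation costing 2**k steps, so the tail behaves like O(1/n).
    """
    return sum(2.0 ** -k for k in range(kmax) if 2 ** k >= n)

# Cumulative mass of all computations costing >= n steps is at most 2/n.
for n in (1, 10, 1000, 10**6):
    assert tail_mass(n) <= 2.0 / n
```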

>To evaluate the plausibility of this, consider your own computer: most
>processes just take a few microseconds, some take a few seconds, few
>take hours, very few take days, etc...

In the work of the GP, that is, in UD*, most processes take billions
of googolplexes of kalpas ...
A lot of processes launched by the UD simply do not stop.
Some are interesting, and perhaps Turing universal, like the process
producing better and better approximations of the Mandelbrot set.
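The standard escape-time test shows what such a non-halting approximation process looks like (a toy sketch: points inside the set are never confirmed by any finite budget).

```python
def escapes_within(c, budget):
    """Semi-decide 'c lies OUTSIDE the Mandelbrot set': iterate z -> z*z + c
    and report True once |z| exceeds 2. For a point inside the set the
    threshold is never crossed, so no finite budget ever answers: the
    naive approximation process simply does not stop."""
    z = 0
    for _ in range(budget):
        z = z * z + c
        if abs(z) > 2:
            return True
    return False   # means "don't know yet", not "inside"

assert escapes_within(1, 10)            # 0, 1, 2, 5, ... escapes quickly
assert not escapes_within(0, 10**4)     # 0 is in the set: z stays at 0
assert not escapes_within(-1, 10**4)    # -1 cycles 0, -1, 0, -1, ... forever
```

Raising the budget refines the picture forever without ever finishing it.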

>We do not know the Great Programmer's computer and algorithm for computing
>universe histories and other things. Still, we know that he cannot do
>it more efficiently than the asymptotically fastest algorithm.

Are you really sure about that? If the GP is really *the* GP,
existing by Church's Thesis, I think it follows from Blum's Speed-up
Theorem that it has no asymptotically fastest algorithm.
But perhaps you are talking of "our little program", generated and
executed by the UD.
If this one has an asymptotically fastest algorithm, are you sure it
can still emulate a UTM? I bet the "universe" is universal.
Also: how could we know and test whether we are in a quick or a slow
version of that little program? Is that not (absolutely) undecidable?
Should that program really be run? If yes, are you not presupposing
a physical universe again?

>The Speed Prior reflects properties of precisely this optimal algorithm.
>It turns out that the best way of predicting future y, given past x,
>is to minimize the (computable) Kt-complexity of (x,y).  As observation
>size grows, the predictions converge to the optimal predictions.

As observation grows, knowledge grows, and we grow, making everything
more complex than before. I guess you mean "optimal predictions" at
some level. With comp, the more knowledge grows, the more ignorance
grows.
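To fix ideas about the quoted Kt, here is a deliberately tiny toy (not Levin's actual universal construction): the only admitted "programs" are of the form repeat PATTERN k times, and each is scored by its description length plus the log of its running time. Repetitive strings then get a much smaller score than pattern-free ones.

```python
from math import log2

def toy_kt(s):
    """Toy stand-in for Levin's Kt (illustration only). The sole program
    class is 'repeat PATTERN k times'; a program is scored by its
    description length (|pattern| bits plus ~log2 bits for the count)
    plus log2 of its running time (|s| steps). The literal program
    (pattern = s, k = 1) always exists as a fallback."""
    best = float("inf")
    for plen in range(1, len(s) + 1):
        if len(s) % plen:
            continue
        pattern, k = s[:plen], len(s) // plen
        if pattern * k != s:
            continue
        best = min(best, plen + log2(k + 1) + log2(len(s)))
    return best

regular = "01" * 32                      # highly compressible, length 64
# First 64 bits of the Thue-Morse sequence: aperiodic, so only the
# literal program fits and the score stays near |s|.
patternless = ("0110100110010110" "1001011001101001"
               "1001011001101001" "0110100110010110")
assert toy_kt(regular) < 14 < 70 < toy_kt(patternless)
```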

>To address Bruno Marchal's questions: Many of my duplications in parallel
>universes with identical past x will experience different futures y.
>None of my copies knows a priori which one it is. But each does know:
>Those futures y with high Kt(x,y) will be highly unlikely; those with
>low Kt(x,y) won't.

It is indeed hard to write a little program which generates a long
Kolmogorov-Chaitin incompressible 01-string.
But, as I told you earlier, it is quite easy to write a little
program which generates them all. (That is the essence of the
everything idea, imo.)

Take for example the iterated duplication experience/experiment.
It can be automated by a very simple program. After 64 iterations
there will be 2^64 ≈ 1.84467×10^19 agents, and most of them will have
an incompressible 01-history. With the comp indeterminism it is
much more likely that you will get such a truly random string
(independently of the fact that you will be unable to prove it is
really random). Those with computable strings will be exceptional, so
that, if those agents work together, they will consider (even with
Bayes) that the simple self-multiplying algorithm is the simplest
and shortest explanation for those appearances of randomness.
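That simple self-multiplying program can even be written down. A toy version in Python (illustration only: 16 rounds instead of 64 to stay feasible, and a crude periodicity test standing in for incompressibility, which is not decidable):

```python
from itertools import product

def duplicate(n):
    """Iterated self-duplication: after n rounds, every agent carries
    one of the 2**n possible 0/1 histories."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def is_periodic(h):
    """Crude compressibility proxy: h is a repetition of a pattern at
    most half its length."""
    return any(h[:p] * (len(h) // p) == h
               for p in range(1, len(h) // 2 + 1) if len(h) % p == 0)

agents = duplicate(16)      # 16 rounds instead of 64, to stay feasible
assert len(agents) == 2 ** 16
# Periodic (compressible) histories are a vanishing fraction; almost
# every agent ends up with a pattern-free history, matching the count
# 2**64 = 18446744073709551616 for the full 64 rounds.
assert sum(map(is_periodic, agents)) / len(agents) < 0.01
```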


