Juergen Schmidhuber wrote:

>We need a prior probability distribution on possible histories.

OK. I agree with that. But of course we differ on the meaning of
"possible histories". And we also tackle the "prior probability" in
quite different ways.

>Then, once we have observed a past history, we can use Bayes' rule to
>compute the probability of various futures, given the observations. Then
>we predict the most probable future.

It seems to me that Bayes' rule is what we *need* to build a plausible
*past*. Concerning the future, and even the present, I prefer a theory,
or a philosophy (or religion), hoping it can shed light on the unknown.

>The algorithmic TOE paper discusses several formally describable priors
>computable in the limit. But for the moment let me focus on my favorite,
>the Speed Prior.

Mmh... You know that there exists a class of infinitely speedable
programs (and theories), and even that this class corresponds to a
superclass of the universal machines. It follows that the UD, the
Universal Dovetailer, which I sometimes like to call the "splashed
Universal Turing Machine" (which btw makes UD* a sort of denotational
semantics of the UTM), could generate and execute quicker and quicker
versions of itself. This means you need a non-universal ontology. To
help others understand the point, I propose we postpone discussing it
until I give the solution of the problem in "diagonalisation 1":
http://www.escribe.com/science/theory/m3079.html

>Assume the Great Programmer has a resource problem, just like any
>other programmer.

You told Wei Dai not to take the Great Programmer too seriously. I am
not sure I can imagine the Great Programmer having a resource problem.
Unless you are a finitist! Knowing that the UTM emulating the UD will
use an unboundable part of its tape... How could the GP have a resource
problem when he is the one building universes in which resource
problems can happen (apparently)? Are you not presupposing some
physical structure in advance somewhere?
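For readers who have not met the UD before: a dovetailer interleaves the execution of every program so that each one, halting or not, eventually receives unboundedly many steps. Here is a minimal Python sketch of the interleaving schedule only (the toy "programs" are plain generators, an assumption for illustration; a real UD would also enumerate and generate the programs themselves):

```python
from itertools import count

def dovetail(programs, phases):
    """Interleave execution of several programs: in phase n, give one
    more step to each of programs 0..n-1.  Every program thus gets an
    unbounded number of steps, even if some programs never halt.
    Returns the trace of (program_index, step_value) pairs executed."""
    trace = []
    for phase in range(1, phases + 1):
        for i in range(min(phase, len(programs))):
            step = next(programs[i])  # run program i for one more step
            trace.append((i, step))
    return trace

# Toy "programs": infinite counters standing in for arbitrary
# (possibly non-halting) computations.
progs = [count() for _ in range(4)]
trace = dovetail(progs, 4)
```

The point of the triangular schedule is that no single non-halting program can starve the others, which is why the UD can afford to run everything, including ever-faster copies of itself.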
>According to the Speed Prior, the harder it is to compute
>something, the less likely it gets. More precisely, the cumulative prior
>probability of all data whose computation costs at least O(n) time is
>at most inversely proportional to n.

Very interesting idea. I would say: the harder it is to compute
something FROM here and now, the less likely it gets from here and
now. And I would add that this is only true if your neighborhood is
less complex than yourself. I really believe your idea is interesting
for some low-level computational physics. But when agents, including
yourself, enter the play, it is another matter, because it is
intrinsically harder to compute things then.

>To evaluate the plausibility of this, consider your own computer: most
>processes just take a few microseconds, some take a few seconds, few
>take hours, very few take days, etc...

? In the work of the GP, that is in UD*, most processes take billions
of googolplexes of kalpas... A lot of processes engaged by the UD
simply do not stop. Some are interesting, and perhaps Turing
universal, like the process raising better and better approximations
of the Mandelbrot set.

>We do not know the Great Programmer's computer and algorithm for computing
>universe histories and other things. Still, we know that he cannot do
>it more efficiently than the asymptotically fastest algorithm.

Are you really sure about that? If the GP is really *the* GP, existing
by Church's Thesis, I think it follows from Blum's Speed-up Theorem
that it has no asymptotically fastest algorithm. But perhaps you are
talking of "our little program" generated and executed by the UD. If
this one has an asymptotically fastest algorithm, are you sure it can
still emulate a UTM? I bet the "universe" is universal. Also: how
could we know and test whether we are in a quick or slow version of
that little program? Is that not (absolutely) undecidable? Should that
program really be run? If yes, are you not presupposing again a
physical universe?
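Schmidhuber's actual Speed Prior is defined via his asymptotically optimal search procedure; the tail bound he states ("mass of all data costing at least n steps is at most proportional to 1/n") can nevertheless be illustrated with a deliberately simplified toy weighting. The choice of weight 1/(t(t+1)) below is my assumption for illustration, not his definition; it is convenient because the weights telescope:

```python
from fractions import Fraction

def toy_weight(t):
    """Toy prior weight of a datum whose computation costs t steps.
    1/(t*(t+1)) = 1/t - 1/(t+1), so the weights for t = 1, 2, 3, ...
    telescope and sum to exactly 1."""
    return Fraction(1, t * (t + 1))

def tail_mass(n, horizon):
    """Total prior mass of all data costing between n and horizon-1
    steps.  By telescoping this equals 1/n - 1/horizon, i.e. the mass
    of everything costing at least n steps is exactly 1/n."""
    return sum(toy_weight(t) for t in range(n, horizon))
```

So under this toy prior the total probability of all histories needing at least n steps is exactly 1/n, matching the shape of the bound quoted above.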
>The Speed Prior reflects properties of precisely this optimal algorithm.
>It turns out that the best way of predicting future y, given past x,
>is to minimize the (computable) Kt-complexity of (x,y). As observation
>size grows, the predictions converge to the optimal predictions.

As observation grows, knowledge grows, and we grow, making everything
more complex than before. I guess you mean "optimal predictions" at
some level. With comp, the more knowledge grows, the more ignorance
grows.

>To address Bruno Marchal's questions: Many of my duplications in parallel
>universes with identical past x will experience different futures y.
>None of my copies knows a priori which one it is. But each does know:
>Those futures y with high Kt(x,y) will be highly unlikely; those with
>low Kt(x,y) won't.

It is indeed hard to write a little program which generates a long
Kolmogorov-Chaitin incompressible 01 string. But, as I told you
earlier, it is quite easy to write a little program which generates
them all. (It is the essence of the everything idea, imo.) Take for
example the iterated duplication experience/experiment. It can be
automated by a very simple program. After 64 iterations there will be
2^64, about 1.84467 * 10^19, agents, and most of them will have an
incompressible 01-history. With the comp indeterminism it is much more
likely you will get such a really random string (independently of the
fact that you will be unable to prove it is really random). Those with
computable strings will be exceptional, so that, if those agents work
together, they will consider (even with Bayes) that the simple
self-multiplying algorithm is the simplest and shortest explanation
for those appearances of randomness.

Bruno
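The counting argument behind "most histories are incompressible" fits in a few lines of Python (a toy illustration, not Bruno's actual duplication program): there are fewer than 2^(n-k+1) programs shorter than n-k+1 bits, so at most that many of the 2^n length-n histories can be compressed by k or more bits.

```python
def duplication_histories(n):
    """After n iterated self-duplications the agents carry, between
    them, all 2**n possible 0/1 histories of length n."""
    return 2 ** n

def max_compressible(n, k):
    """Upper bound on how many n-bit histories can be compressed by k
    or more bits: there are at most 2**(n-k+1) - 1 binary programs
    shorter than n-k+1 bits, and each describes at most one string."""
    return 2 ** (n - k + 1) - 1

agents = duplication_histories(64)           # 2**64 agents
frac = max_compressible(64, 10) / agents     # share compressible by >= 10 bits
```

With n = 64 and k = 10, fewer than 0.2% of the agents can have a history compressible by even 10 bits, which is the sense in which almost all of them find themselves with a random-looking 01-history.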