Hello Edward, I'm glad you found some of the writing and links interesting. Let me try to answer some of your questions.
> I understand the basic idea that if you are seeking a prior
> likelihood for the occurrence of an event and you have no data about
> the frequency of its occurrence -- absent any other knowledge --
> some notion of the complexity, in information-theory terms, of the
> event might help you make a better guess. This makes sense because
> reality is a big computer, and complexity -- in terms of the number
> of combined events required to make reality cough up a given event,
> and the complexity of the space in which those events are to be
> combined -- should to some extent be related to the event's
> probability. I can understand how such complexity could be
> approximated by the length of code required in some theoretical
> Universal computer to model such real-world event-occurrence
> complexity.

This seems like a reasonable summary to me. Ok, let's take your example, as I think it captures the essence of what you are getting at:

> So what I am saying is, for example, that if you are receiving a
> sequence of bytes from a video camera, much of the complexity in the
> input stream might not be related to complexity-of-event-creation or
> Occam's-razor-type issues, but rather to complexity of perception,
> or similarity understanding, or of appropriate context selection,
> factors which are not themselves necessarily related to complexity
> of occurrence.

In short, yes. So, for example, if you hooked up a Solomonoff induction machine to a video camera, it would first need to, in some sense, "understand" the nature of this input stream. This may be more complex than what the camera is actually looking at! Given that the Solomonoff machine starts from zero knowledge of the world, other than the special form of prior knowledge provided by the universal prior, there is no way around this problem. Somehow it has to learn this stuff. The good news, as Solomonoff proved, is that if the encoding of the video input stream isn't too crazy complex (i.e.
there exists a good algorithm for processing the stream that isn't too long), then the Solomonoff machine will very quickly work out how to understand the video stream. Furthermore, if we put such a camera on a robot or something wandering around in the world, it would not take long at all before the complexity of the observed world far surpassed the complexity of the video stream encoding.

Perhaps what you should consider is a Solomonoff machine that has been pre-trained to do, say, vision. That is, you take the machine and train it up on some simple visual input so that it understands the nature of this input. Only then do you look at how well it performs at finding structure in the world through its visual input stream.

> Furthermore, I am saying that for an AGI it seems to me it would
> make much more sense to attempt to derive priors from notions of
> similarity, of probabilities of similar things, events, and
> contexts, and from things like causal models for similar or
> generalized classes. There is usually much from reality that we do
> know that we can, and do, use when learning about things we don't
> know.

Yes. In essence, what you seem to be saying is that our prior knowledge of the world strongly biases how we interpret new information. So, to use your example, we all know that the people living on a small isolated island are probably genetically quite similar to each other. Thus, if we see that one of them has brown skin, we will guess that the others probably do too. Weight, however, is not so closely tied to genetics, so if one islander is obese this does not tell us much about how much the other islanders weigh. Out of the box, a Solomonoff machine doesn't know anything about genetics or weight, so it can't make such inferences from seeing just one islander. However, if it did have prior experience with genetics and so on, then it too would generalise as you describe, using context.
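As a concrete illustration of this kind of compression-based similarity, here is a minimal sketch of the Normalized Compression Distance in Python, using zlib as a crude stand-in for Kolmogorov complexity. The helper names and test strings are illustrative only; practical NCD work typically uses stronger compressors than zlib.

```python
import random
import zlib

def c(data: bytes) -> int:
    # Compressed length: a crude, computable stand-in for the
    # (uncomputable) Kolmogorov complexity of the data.
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    # NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    cx, cy = c(x), c(y)
    return (c(x + y) - min(cx, cy)) / max(cx, cy)

# Similar inputs compress well together, so their NCD is small;
# unrelated, incompressible inputs push the NCD towards 1.
a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"the quick brown fox leaps over the lazy cat " * 20
random.seed(0)
r = bytes(random.randrange(256) for _ in range(len(a)))

print(round(ncd(a, b), 2), round(ncd(a, r), 2))
```

With a real-world compressor the distance is only approximate (it can even slightly exceed 1), but the ordering is what matters: related objects come out closer together than unrelated ones, with no hand-built notion of similarity.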
Perhaps the best place to understand the theory of this is Section 8.3 of "An Introduction to Kolmogorov Complexity and Its Applications" by Li and Vitanyi. You can also find approximations to this theory that have been applied in practice to many diverse problems under the title of "Normalized Compression Distance", or NCD. A lot of this work has been done by Rudi Cilibrasi.

> HOW VALUABLE IS SOLOMONOFF INDUCTION FOR REAL WORLD AGI?

Well ;-) In a certain literal sense, not much, as it is not computable. However, many practical methods in machine learning and statistics can be viewed as computable approximations of Solomonoff induction, and things like NCD have been used in practice with some success. And who knows, perhaps some very smart person will come up with a new version of Solomonoff induction that is much more practically useful. Personally, I suspect other approaches will reach human-level AGI first.

If you are interested in this topic, I'm currently finishing off the last bits of my PhD thesis, which has a chapter that explains Solomonoff induction and AIXI in the most complete and simple way I could come up with. It should be available before too long, once I've done some final polishing of the text.

Cheers,
Shane

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=63490419-d1495d