Jim, what evidence do you have that Occam's Razor or algorithmic information theory is wrong, besides your own opinions? It is well established that elegant (short) theories are preferred in all branches of science because they have greater predictive power.
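The countability point can be made concrete: hypotheses in AIT are finite bit strings, so they can be enumerated, and a length-penalizing prior over them has convergent total mass. A toy sketch only — the weight 2^-(2·len) is an illustrative stand-in, not the actual prefix-free Solomonoff prior:

```python
from itertools import product

# Toy illustration (not the real Solomonoff measure): hypotheses are finite
# bit strings, hence countable, and a length-penalizing prior sums to a
# finite total. Weight 2^-(2*len) gives each length class n total mass 2^-n.
def hypotheses(max_len):
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def prior(h):
    return 2.0 ** (-2 * len(h))

total = sum(prior(h) for h in hypotheses(16))
print(total)  # sum over n of 2^n * 2^-2n = sum of 2^-n, approaching 1
```

Because the strings can be listed out like this, diagonalization over an uncountable set never enters the picture.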
Also, what does this have to do with Cantor's diagonalization argument? AIT considers only the countably infinite set of hypotheses.

-- Matt Mahoney, [email protected]

________________________________
From: Jim Bromer <[email protected]>
To: agi <[email protected]>
Sent: Wed, June 30, 2010 9:13:44 AM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI

On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski <[email protected]> wrote:
> In brief, the answer to your question is: we formalize the description length heuristic by assigning lower probabilities to longer hypotheses, and we apply Bayes' law to update these probabilities given the data we observe. This updating captures the idea that we should reward theories which explain/expect more of the observations; it also provides a natural way to balance simplicity vs. explanatory power, so that we can compare any two theories with a single scoring mechanism. Bayes' law automatically places the right amount of pressure to avoid overly elegant explanations which don't get much right, and to avoid overly complex explanations which fit the observations perfectly but which probably won't generalize to new data. ... If you go down this path, you will eventually come to understand (and, probably, accept) algorithmic information theory. Matt may be trying to force it on you too soon. :)
> --Abram

David was asking about theories of explanation, and here you are suggesting that following a certain path of reasoning will lead to accepting AIT. What nonsense. Even assuming that Bayes' law can be used to update probabilities of idealized utility, the connection between description length and explanatory power in general AI is tenuous.
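The two-part scoring mechanism in the quoted paragraph can be sketched in a toy comparison. The data string, bit costs, and hypothesis names below are made up for illustration; the point is only the tradeoff: score(h) = bits to state h + bits to encode the data using h, lower being better.

```python
import math

# Toy MDL comparison: a short "fair coin" hypothesis vs. a fitted-bias
# hypothesis that costs more bits to state but encodes the data cheaper.
data = "110110111011111011"  # 14 ones, 4 zeros (illustrative)

def data_bits(p_one, data):
    # Shannon code length of the data under a Bernoulli(p_one) model
    return sum(-math.log2(p_one if c == "1" else 1.0 - p_one) for c in data)

hypotheses = {
    "fair":   {"desc_bits": 1, "p": 0.5},                          # very short to state
    "biased": {"desc_bits": 8, "p": data.count("1") / len(data)},  # fitted bias, costlier to state
}

scores = {name: h["desc_bits"] + data_bits(h["p"], data)
          for name, h in hypotheses.items()}
print(scores)  # here the 8 bits spent stating the bias outweigh its coding savings
```

With a longer sample at the same bias, the per-symbol coding savings eventually dominate the fixed cost of stating the bias and the fitted hypothesis wins — exactly the simplicity-vs.-fit balance the quoted paragraph describes.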
And when you realize that AIT is an unattainable idealism that lacks mathematical power (I do not believe it is a valid mathematical method, because it is incomputable and therefore innumerable, and cannot be used to derive probability distributions even as ideals), you have to accept that the connection between explanatory theories and AIT has not been established, except as a special case built on the assumption that a similarity among a subclass of practical examples is the same as a powerful generalization of those examples.

The problem is that while compression seems to be related to intelligence, it is not equivalent to intelligence. A much stronger, but similarly false, argument is that memory is intelligence. Of course memory is a major part of intelligence, but it is not everything. The argument that AIT is a reasonable substitute for developing more sophisticated theories about conceptual explanation is not well founded; it lacks any experimental evidence other than a smattering of results on simplistic cases, and it is just wrong to suggest that there is no reason to consider other theories of explanation.

Yes, compression has something to do with intelligence, and in some special cases it can be shown to act as an idealization of numerical rationality. And yes, unattainable theories that examine the boundaries of productive mathematical systems are a legitimate subject for mathematics. But there is so much more to theories of explanatory reasoning that I genuinely feel sorry for those of you who, originally motivated to develop better AGI programs, would get caught in the obvious traps of AIT and AIXI.

Jim Bromer

On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski <[email protected]> wrote:
> David,
>
> What Matt is trying to explain is all right, but I think a better way of answering your question would be to invoke the mighty mysterious Bayes' law.
>
> I had an epiphany similar to yours (the one that started this thread) about 5 years ago now.
> At the time I did not know that it had all been done before. I think many people feel this way about MDL. Looking into the MDL (minimum description length) literature would be a good starting point.
>
> In brief, the answer to your question is: we formalize the description length heuristic by assigning lower probabilities to longer hypotheses, and we apply Bayes' law to update these probabilities given the data we observe. This updating captures the idea that we should reward theories which explain/expect more of the observations; it also provides a natural way to balance simplicity vs. explanatory power, so that we can compare any two theories with a single scoring mechanism. Bayes' law automatically places the right amount of pressure to avoid overly elegant explanations which don't get much right, and to avoid overly complex explanations which fit the observations perfectly but which probably won't generalize to new data.
>
> Bayes' law and MDL have strong connections, though sometimes they part ways. There are deep theorems here. For me it's good enough to note that if we're using a maximally efficient code for our knowledge representation, they are equivalent. (This in itself involves some deep math; I can explain if you're interested, though I believe I've already posted a writeup to this list in the past.) Bayesian updating is essentially equivalent to scoring hypotheses as: hypothesis size + size of data's description using hypothesis. Lower scores are better (as the score is approximately -log(probability)).
>
> If you go down this path, you will eventually come to understand (and, probably, accept) algorithmic information theory. Matt may be trying to force it on you too soon.
> :)
>
> --Abram
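The equivalence Abram states — hypothesis size plus data-description size, with score approximately -log(probability) — can be sketched by running it backwards: treat the bit scores as -log2 of joint probabilities and normalize to recover a posterior. The scores below are illustrative numbers, not measurements from any real system.

```python
# Sketch of the Bayes/MDL correspondence: with P(h) = 2^-size(h) and
# P(data|h) = 2^-codelen(data|h), score(h) = size(h) + codelen(data|h)
# is approximately -log2 P(h, data), so normalizing 2^-score gives a posterior.
scores = {"h1": 19.0, "h2": 21.8}  # illustrative bit scores for two hypotheses
weights = {h: 2.0 ** (-s) for h, s in scores.items()}
z = sum(weights.values())
posterior = {h: w / z for h, w in weights.items()}
print(posterior)  # the lower-scoring hypothesis carries the larger posterior mass
```

This is why minimizing the two-part code length and maximizing posterior probability select the same hypothesis when the codes are efficient.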
