On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski <[email protected]> wrote:

> In brief, the answer to your question is: we formalize the description length heuristic by assigning lower probabilities to longer hypotheses, and we apply Bayes' law to update these probabilities given the data we observe. This updating captures the idea that we should reward theories which explain/expect more of the observations; it also provides a natural way to balance simplicity vs. explanatory power, so that we can compare any two theories with a single scoring mechanism. Bayes' law automatically places the right amount of pressure to avoid overly elegant explanations which don't get much right, and to avoid overly complex explanations which fit the observations perfectly but which probably won't generalize to new data.
>
> ...
>
> If you go down this path, you will eventually come to understand (and, probably, accept) algorithmic information theory. Matt may be trying to force it on you too soon. :)
>
> --Abram
David was asking about theories of explanation, and here you are suggesting that following a certain path of reasoning will lead to accepting AIT. What nonsense.

Even assuming that Bayes' law can be used to update probabilities of idealized utility, the connection between description length and explanatory power in general AI is tenuous. And when you realize that AIT is an unattainable idealism that lacks mathematical power (I do not believe it is a valid mathematical method, because it is incomputable and therefore innumerable, and cannot be used to derive probability distributions even as ideals), you have to accept that the connection between explanatory theories and AIT is not established except as a special case, one that rests on imagining that similarities among a subclass of practical examples amount to a powerful generalization of those examples.

The problem is that while compression seems to be related to intelligence, it is not equivalent to intelligence. A much stronger, but similarly false, argument is that memory is intelligence. Of course memory is a major part of intelligence, but it is not everything.

The argument that AIT is a reasonable substitute for developing more sophisticated theories about conceptual explanation is not well founded; it lacks any experimental evidence beyond a smattering of results on simplistic cases, and it is just wrong to suggest that there is no reason to consider other theories of explanation. Yes, compression has something to do with intelligence, and in some special cases it can be shown to act as an idealization for numerical rationality. And yes, unattainable theories that examine the boundaries of productive mathematical systems are a legitimate subject for mathematics. But there is so much more to theories of explanatory reasoning that I genuinely feel sorry for those of you who, originally motivated to develop better AGI programs, have gotten caught in the obvious traps of AIT and AIXI.
Jim Bromer

On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski <[email protected]> wrote:

> David,
>
> What Matt is trying to explain is all right, but I think a better way of answering your question would be to invoke the mighty mysterious Bayes' law.
>
> I had an epiphany similar to yours (the one that started this thread) about 5 years ago now. At the time I did not know that it had all been done before. I think many people feel this way about MDL. Looking into the MDL (minimum description length) literature would be a good starting point.
>
> In brief, the answer to your question is: we formalize the description length heuristic by assigning lower probabilities to longer hypotheses, and we apply Bayes' law to update these probabilities given the data we observe. This updating captures the idea that we should reward theories which explain/expect more of the observations; it also provides a natural way to balance simplicity vs. explanatory power, so that we can compare any two theories with a single scoring mechanism. Bayes' law automatically places the right amount of pressure to avoid overly elegant explanations which don't get much right, and to avoid overly complex explanations which fit the observations perfectly but which probably won't generalize to new data.
>
> Bayes' law and MDL have strong connections, though sometimes they part ways. There are deep theorems here. For me it's good enough to note that if we're using a maximally efficient code for our knowledge representation, they are equivalent. (This in itself involves some deep math; I can explain if you're interested, though I believe I've already posted a writeup to this list in the past.) Bayesian updating is essentially equivalent to scoring hypotheses as: hypothesis size + size of the data's description using the hypothesis. Lower scores are better (as the score is approximately -log(probability)).
> If you go down this path, you will eventually come to understand (and, probably, accept) algorithmic information theory. Matt may be trying to force it on you too soon. :)
>
> --Abram

-------------------------------------------
agi Archives: https://www.listbox.com/member/archive/303/=now
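For readers following along: the two-part scoring rule Abram quotes (hypothesis size + size of the data's description under the hypothesis, with score ≈ -log probability) can be sketched numerically. The bit counts below are made-up illustrations of the trade-off, not anything from the thread:

```python
import math

def mdl_score(prior, likelihood):
    """Two-part code length in bits: -log2(prior) is the hypothesis
    description length, -log2(likelihood) is the data description
    length given the hypothesis. Lower total is better."""
    return -math.log2(prior) - math.log2(likelihood)

# A short hypothesis that explains the data poorly:
# 5 bits to state it, 20 bits left over to encode the data.
simple = mdl_score(prior=2**-5, likelihood=2**-20)    # 25.0 bits

# A longer hypothesis that explains the data well:
# 15 bits to state it, only 4 bits to encode the data.
detailed = mdl_score(prior=2**-15, likelihood=2**-4)  # 19.0 bits

# Minimizing the total code length is the same comparison Bayes' law
# makes, since score = -log2(prior * likelihood).
assert detailed < simple
```

Here the more complex hypothesis wins because its better fit saves more bits than its longer description costs; flip the numbers and the simple hypothesis wins, which is the balance between simplicity and explanatory power described above.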
