Jim, what evidence do you have that Occam's Razor or algorithmic information 
theory is wrong, besides your own opinions? It is well established that elegant 
(short) theories are preferred in all branches of science because they have 
greater predictive power.

Also, what does this have to do with Cantor's diagonalization argument? AIT 
considers only the countably infinite set of hypotheses.
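For concreteness, here is a small sketch (mine, not from the thread) of why the hypothesis space is countable: programs are finite binary strings, and finite binary strings can be enumerated in length order, so no diagonalization over an uncountable set is involved.

```python
from itertools import count, product

# Sketch: enumerate all finite binary strings in length order.
# Every program (hypothesis) in AIT is one of these strings, so the
# hypothesis space is countably infinite.
def enumerate_hypotheses():
    for n in count(1):                      # lengths 1, 2, 3, ...
        for bits in product("01", repeat=n):
            yield "".join(bits)

gen = enumerate_hypotheses()
first_six = [next(gen) for _ in range(6)]
print(first_six)  # ['0', '1', '00', '01', '10', '11']
```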

 -- Matt Mahoney, [email protected]




________________________________
From: Jim Bromer <[email protected]>
To: agi <[email protected]>
Sent: Wed, June 30, 2010 9:13:44 AM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski <[email protected]> wrote:
In brief, the answer to your question is: we formalize the description length 
heuristic by assigning lower probabilities to longer hypotheses, and we apply 
Bayes law to update these probabilities given the data we observe. This 
updating captures the idea that we should reward theories which explain/expect 
more of the observations; it also provides a natural way to balance simplicity 
vs explanatory power, so that we can compare any two theories with a single 
scoring mechanism. Bayes Law automatically places the right amount of pressure 
to avoid overly elegant explanations which don't get much right, and to avoid 
overly complex explanations which fit the observations perfectly but which 
probably won't generalize to new data.
...
If you go down this path, you will eventually come to understand (and, 
probably, accept) algorithmic information theory. Matt may be trying to force it 
on you too soon. :)
--Abram 
 
David was asking about theories of explanation, and here you are suggesting 
that following a certain path of reasoning will lead to accepting AIT.  What 
nonsense.  Even assuming that Bayes' law can be used to update probabilities of 
idealized utility, the connection between description length and explanatory 
power in general AI is tenuous.  And when you realize that AIT is an 
unattainable idealism that lacks mathematical power (I do not believe that it 
is a valid mathematical method, because it is incomputable and therefore 
innumerable, and cannot be used to derive probability distributions even as 
ideals), you have to accept that the connection between explanatory theories and 
AIT is not established except as a special case, based on the imagination that a 
similarity among a subclass of practical examples is the same as a powerful 
generalization of those examples.  
 
The problem is that while compression seems to be related to intelligence, it 
is not equivalent to intelligence.  A much stronger but similarly false 
argument is that memory is intelligence.  Of course memory is a major part of 
intelligence, but it is not everything.  The argument that AIT is a reasonable 
substitute for developing more sophisticated theories about conceptual 
explanation is not well founded; it lacks any experimental evidence other than 
a smattering of results on simplistic cases, and it is just wrong to suggest 
that there is no reason to consider other theories of explanation.
 
Yes, compression has something to do with intelligence and, in some special 
cases, it can be shown to act as an idealism for numerical rationality.  And yes, 
unattainable theories that examine the boundaries of productive mathematical 
systems are a legitimate subject for mathematics.  But there is so much more to 
theories of explanatory reasoning that I genuinely feel sorry for those of you 
who, originally motivated to develop better AGI programs, would get caught in 
the obvious traps of AIT and AIXI.
 
Jim Bromer 

 
On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski <[email protected]> wrote:

David,
>
>What Matt is trying to explain is all right, but I think a better way of 
>answering your question would be to invoke the mighty mysterious Bayes' Law.
>
>I had an epiphany similar to yours (the one that started this thread) about 5 
>years ago now. At the time I did not know that it had all been done before. I 
>think many people feel this way about MDL. Looking into the MDL (minimum 
>description length) literature would be a good starting point.
>
>In brief, the answer to your question is: we formalize the description length 
>heuristic by assigning lower probabilities to longer hypotheses, and we apply 
>Bayes law to update these probabilities given the data we observe. This 
>updating captures the idea that we should reward theories which explain/expect 
>more of the observations; it also provides a natural way to balance simplicity 
>vs explanatory power, so that we can compare any two theories with a single 
>scoring mechanism. Bayes Law automatically places the right amount of pressure 
>to avoid overly elegant explanations which don't get much right, and to avoid 
>overly complex explanations which fit the observations perfectly but which 
>probably won't generalize to new data.
>
>Bayes' Law and MDL have strong connections, though sometimes they part ways. 
>There are deep theorems here. For me it's good enough to note that if we're 
>using a maximally efficient code for our knowledge representation, they are 
>equivalent. (This in itself involves some deep math; I can explain if you're 
>interested, though I believe I've already posted a writeup to this list in the 
>past.) Bayesian updating is essentially equivalent to scoring hypotheses as: 
>hypothesis size + size of data's description using hypothesis. Lower scores 
>are better (as the score is approximately -log(probability)).
>
>If you go down this path, you will eventually come to understand (and, 
>probably, accept) algorithmic information theory. Matt may be trying to force 
>it on you too soon. :)
>
>--Abram 
>
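As a toy illustration of the scoring rule Abram describes (score = hypothesis size + size of the data's description using the hypothesis, lower is better, with probability proportional to 2^-score). The hypothesis names and bit counts below are invented for the example, not taken from the thread:

```python
# Toy MDL-style scoring: charge each hypothesis for its own length in bits
# plus the bits needed to describe the data given the hypothesis.
# All names and bit counts here are made up for illustration.
hypotheses = {
    # name: (hypothesis length in bits, data description length in bits)
    "elegant-but-wrong":    (10, 500),  # short theory, explains little
    "balanced":             (60, 120),  # moderate theory, good fit
    "overfit-lookup-table": (400, 0),   # memorizes the data exactly
}

def mdl_score(h_bits, d_bits):
    """Lower is better; approximately -log2(probability)."""
    return h_bits + d_bits

scores = {name: mdl_score(h, d) for name, (h, d) in hypotheses.items()}
best = min(scores, key=scores.get)

# Scores to probabilities: p proportional to 2**(-score), then normalize.
raw = {name: 2.0 ** -s for name, s in scores.items()}
total = sum(raw.values())
posterior = {name: p / total for name, p in raw.items()}

print(best)  # → balanced
```

Note how the middle hypothesis wins: the very short theory pays heavily to describe the data, and the memorizing theory pays heavily for its own length, which is exactly the simplicity-vs-fit balance described above.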


-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c
Powered by Listbox: http://www.listbox.com
