Occam's Razor is not a provable theory, and I have come across philosophers
of science who also question its value as a scientific heuristic.  I can
look for some more thorough presentations, and I am willing to give you some
of my opinions on that question if you want me to.  The "evidence" would be
drawn from cases where a simpler theory was later displaced because a more
complicated theory better explained the results of scientifically
gathered evidence.  Einstein's relativity theories added more complexity to
Newton's theories, but they better explained details and results than
Newton's theories did.

I cannot recall the details of why I believe that the central premise of
algorithmic information theory is incomputable due to Cantor's
diagonalization argument, but I thought it underlay the reasoning of
Chaitin's incompleteness theorem
http://en.wikipedia.org/wiki/Kolmogorov_complexity and that this is why the
shortest program that outputs a given string cannot, in general, be computed.
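For what it's worth, the computable side of this is easy to illustrate: any
real compressor yields only an upper bound on the Kolmogorov complexity
K(x), since the compressed output plus a fixed-size decompressor is itself a
program that prints x; the true shortest program remains uncomputable.  A
rough Python sketch of my own, purely for illustration:

```python
import os
import zlib

def k_upper_bound(s: bytes) -> int:
    """Computable upper bound on K(s): the length of a zlib-compressed
    copy of s (ignoring the constant-size decompressor).  The exact
    K(s) cannot be computed, so bounds like this are the best we get."""
    return len(zlib.compress(s, 9))

structured = b"ab" * 500       # highly regular: a short program prints it
random_ish = os.urandom(1000)  # incompressible with overwhelming probability

print(k_upper_bound(structured))  # far below 1000
print(k_upper_bound(random_ish))  # near (or slightly above) 1000
```

The gap between the two bounds is the practical content of the theory; what
the incomputability result rules out is ever certifying that a given bound
is tight.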

Jim Bromer
On Wed, Jun 30, 2010 at 5:13 PM, Matt Mahoney <[email protected]> wrote:

>   Jim, what evidence do you have that Occam's Razor or algorithmic
> information theory is wrong, besides your own opinions? It is well
> established that elegant (short) theories are preferred in all branches of
> science because they have greater predictive power.
>
> Also, what does this have to do with Cantor's diagonalization argument? AIT
> considers only the countably infinite set of hypotheses.
>
>
> -- Matt Mahoney, [email protected]
>
>
>  ------------------------------
> *From:* Jim Bromer <[email protected]>
>
> *To:* agi <[email protected]>
> *Sent:* Wed, June 30, 2010 9:13:44 AM
> *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI
>
> On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski <[email protected]> wrote:
> In brief, the answer to your question is: we formalize the description
> length heuristic by assigning lower probabilities to longer hypotheses, and
> we apply Bayes law to update these probabilities given the data we observe.
> This updating captures the idea that we should reward theories which
> explain/expect more of the observations; it also provides a natural way to
> balance simplicity vs explanatory power, so that we can compare any two
> theories with a single scoring mechanism. Bayes Law automatically places the
> right amount of pressure to avoid overly elegant explanations which don't
> get much right, and to avoid overly complex explanations which fit the
> observations perfectly but which probably won't generalize to new data.
> ...
> If you go down this path, you will eventually come to understand (and,
> probably, accept) algorithmic information theory. Matt may be trying to force
> it on you too soon. :)
> --Abram
>
> David was asking about theories of explanation, and here you are suggesting
> that following a certain path of reasoning will lead to accepting AIT.  What
> nonsense.  Even assuming that Bayes' law can be used to update probabilities
> of idealized utility, the connection between description length and
> explanatory power in general AI is tenuous.  And when you realize that AIT
> is an unattainable idealism that lacks mathematical power (I do not believe
> that it is a valid mathematical method, because it is incomputable and
> therefore innumerable, and it cannot be used to derive probability
> distributions even as ideals), you have to accept that the connection
> between explanatory theories and AIT is not established except as a special
> case based on the imagination that a similarity among a subclass of
> practical examples is the same as a powerful generalization of those
> examples.
>
> The problem is that while compression seems to be related to intelligence,
> it is not equivalent to intelligence.  A much stronger but similarly false
> argument is that memory is intelligence.  Of course memory is a major part
> of intelligence, but it is not everything.  The argument that AIT is a
> reasonable substitute for developing more sophisticated theories about
> conceptual explanation is not well founded; it lacks any experimental
> evidence other than a smattering of results on simplistic cases, and it is
> just wrong to suggest that there is no reason to consider other theories of
> explanation.
>
> Yes, compression has something to do with intelligence and, in some special
> cases, it can be shown to act as an idealism for numerical rationality.  And
> yes, unattainable theories that examine the boundaries of productive
> mathematical systems are a legitimate subject for mathematics.  But there is
> so much more to theories of explanatory reasoning that I genuinely feel
> sorry that those of you who were originally motivated to develop better AGI
> programs would get caught in the obvious traps of AIT and AIXI.
>
> Jim Bromer
>
>
> On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski <[email protected]> wrote:
>
>> David,
>>
>> What Matt is trying to explain is all right, but I think a better way of
>> answering your question would be to invoke the mighty mysterious Bayes' Law.
>>
>> I had an epiphany similar to yours (the one that started this thread)
>> about 5 years ago now. At the time I did not know that it had all been done
>> before. I think many people feel this way about MDL. Looking into the MDL
>> (minimum description length) literature would be a good starting point.
>>
>> In brief, the answer to your question is: we formalize the description
>> length heuristic by assigning lower probabilities to longer hypotheses, and
>> we apply Bayes law to update these probabilities given the data we observe.
>> This updating captures the idea that we should reward theories which
>> explain/expect more of the observations; it also provides a natural way to
>> balance simplicity vs explanatory power, so that we can compare any two
>> theories with a single scoring mechanism. Bayes Law automatically places the
>> right amount of pressure to avoid overly elegant explanations which don't
>> get much right, and to avoid overly complex explanations which fit the
>> observations perfectly but which probably won't generalize to new data.
>>
>> Bayes' Law and MDL have strong connections, though sometimes they part
>> ways. There are deep theorems here. For me it's good enough to note that if
>> we're using a maximally efficient code for our knowledge representation,
>> they are equivalent. (This in itself involves some deep math; I can explain
>> if you're interested, though I believe I've already posted a writeup to this
>> list in the past.) Bayesian updating is essentially equivalent to scoring
>> hypotheses as: hypothesis size + size of data's description using
>> hypothesis. Lower scores are better (as the score is approximately
>> -log(probability)).
>>
>> If you go down this path, you will eventually come to understand (and,
>> probably, accept) algorithmic information theory. Matt may be trying to force
>> it on you too soon. :)
>>
>> --Abram
>>
>   *agi* | Archives <https://www.listbox.com/member/archive/303/=now>
> <https://www.listbox.com/member/archive/rss/303/> | 
> Modify<https://www.listbox.com/member/?&;>Your Subscription
> <http://www.listbox.com/>
>



-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c
Powered by Listbox: http://www.listbox.com
