Pei> Triggered by several recent discussions, I'd like to make the
Pei> following position statement, though won't commit myself to long
Pei> debate on it. ;-)

Pei> Occam's Razor, in its original form, goes like "entities must not
Pei> be multiplied beyond necessity", and it is often stated as "All
Pei> other things being equal, the simplest solution is the best" or
Pei> "when multiple competing theories are equal in other respects,
Pei> the principle recommends selecting the theory that introduces the
Pei> fewest assumptions and postulates the fewest entities" --- all
Pei> from http://en.wikipedia.org/wiki/Occam's_razor

Pei> I fully agree with all of the above statements.

Pei> However, to me, there are two common misunderstandings associated
Pei> with it in the context of AGI and philosophy of science.

Pei> (1) To take this statement as self-evident or a stand-alone
Pei> postulate

Pei> To me, it is derived or implied by the insufficiency of
Pei> resources. If a system has sufficient resources, it has no good
Pei> reason to prefer a simpler theory.

With all due respect, this is mistaken.
Occam's Razor, in some form, is the heart of generalization, which
is the essence (and the G) of GI.

For example, if you study concept learning from examples,
say in the PAC learning context (related theorems
hold in some other contexts as well),
there are theorems to the effect that if you find
a hypothesis from a simple enough class of hypotheses,
it will with very high probability accurately classify new
examples chosen from the same distribution.

Conversely, there are theorems stating (roughly speaking) that
any method which chooses a hypothesis from too expressive a class
of hypotheses has a probability, bounded below by some reasonable
number like 1/7, of making large errors in its predictions on new
examples. In other words, it is impossible to PAC learn without
respecting Occam's Razor.
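To give the flavor of the first kind of theorem, here is the standard
finite-class "Occam" bound from PAC learning (this statement is my
gloss, not a quote from the book): if a learner outputs any hypothesis
h from a finite class H consistent with m independent training
examples, then

```latex
% PAC/Occam bound for a finite hypothesis class H (realizable case):
% a consistent hypothesis h generalizes once enough examples are seen.
\Pr\bigl[\mathrm{err}(h) \le \epsilon\bigr] \;\ge\; 1 - \delta
\qquad \text{whenever} \qquad
m \;\ge\; \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right).
```

The required sample size grows only with ln|H|, so "simple enough
class" means small enough (or concisely enough describable) that
ln|H| is manageable relative to your data; VC-dimension lower bounds
supply the converse direction.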

For discussion of the above paragraphs, I'd refer you to
Chapter 4 of What is Thought? (MIT Press, 2004).

In other words, if you are building some system that learns
about the world, it had better respect Occam's razor if you
want whatever it learns to apply to new experience. 
(I use the term Occam's razor loosely; using
hypotheses that are highly constrained in ways other than
just being concise may work, but you'd better respect
"simplicity" broadly defined. See Chap 6 of WIT? for
more discussion of this point.)
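To see the failure mode concretely, here is a toy numerical sketch
(my own illustration, not from the book; all the constants are
arbitrary): two hypothesis classes, degree-1 and degree-15
polynomials, both fit 16 noisy samples of a linear rule, but only the
simple class predicts well on fresh examples.

```python
# Toy illustration (my own sketch, not from the book): both classes fit
# the 16 training points, but only the simple class generalizes.
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Noisy samples of a genuinely simple rule: y = 2x + noise."""
    x = rng.uniform(-1.0, 1.0, n)
    return x, 2.0 * x + rng.normal(0.0, 0.1, n)

x_train, y_train = sample(16)    # small training set
x_test, y_test = sample(1000)    # fresh examples, same distribution

for degree in (1, 15):           # simple class vs. too-expressive class
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.4f}, "
          f"test MSE {test_mse:.4f}")
```

The expressive class drives training error to near zero while test
error blows up, which is the lower-bound theorems' point in miniature.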

The core problem of GI is generalization: you want to be able to
figure out new problems as they come along that you haven't seen
before. In order to do that, you basically must, implicitly or
explicitly, employ some version of Occam's Razor, independent of
how many resources you have.
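One explicit way to "employ some version of Occam's Razor", among
many: a two-part minimum-description-length (MDL) score. The sketch
below is hypothetical and crude (a flat 32 bits per coefficient and a
Gaussian code for residuals are arbitrary choices), but it shows the
shape of the trade-off: complexity bits versus fit bits.

```python
# Hypothetical two-part MDL scorer (my sketch): total description
# length = bits to state the model + bits to state the data given it.
import math
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 40)
y = 2.0 * x + rng.normal(0.0, 0.1, 40)    # the world is actually simple

BITS_PER_COEFF = 32.0                     # crude cost of one parameter

def description_length(degree):
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    var = max(float(np.mean(resid ** 2)), 1e-12)
    model_bits = BITS_PER_COEFF * (degree + 1)
    # Gaussian code length for the residuals; it can go negative, but
    # only differences between scores matter.
    data_bits = 0.5 * len(x) * math.log2(2.0 * math.pi * math.e * var)
    return model_bits + data_bits

best = min(range(8), key=description_length)
print("MDL picks degree", best)           # expect 1: the Occam choice
```

The model-bits term is exactly the Occam penalty: each extra
coefficient must buy at least 32 bits of better fit to be worth
stating.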

In my view, the first and most important question to ask about
any proposal for AGI is: in what way is it going to produce
Occam hypotheses? If you can't answer that, don't bother implementing
a huge system in hopes of capturing your many insights, because
the bigger your implementation gets, the less likely it is to
get where you want in the end.

