Re: [agi] Occam's Razor and its abuse
Let's try this . . . . In "Universal Algorithmic Intelligence", on page 20, Hutter uses Occam's razor in the definition of ξ. Then, at the bottom of the page, he merely claims that using ξ as an estimate for μ "may be a reasonable thing to do." That's not a proof of Occam's Razor.

= = = = = =

He also references Occam's Razor on page 33, where he says: "We believe the answer to be negative, which on the positive side would show the necessity of Occam's razor assumption, and the distinguishedness of AIXI." That's calling Occam's razor a necessary assumption and basing that upon a *belief*.

= = = = = =

Where do you believe that he proves Occam's razor?

- Original Message -
From: Matt Mahoney [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Wednesday, October 29, 2008 10:46 PM
Subject: Re: [agi] Occam's Razor and its abuse

--- On Wed, 10/29/08, Mark Waser [EMAIL PROTECTED] wrote:

Hutter *defined* the measure of correctness using simplicity as a component. Of course, they're correlated when you do such a thing. That's not a proof, that's an assumption.

Hutter defined the measure of correctness as the accumulated reward by the agent in AIXI.

-- Matt Mahoney, [EMAIL PROTECTED]
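For readers following the ξ discussion, here is a toy sketch of the kind of Occam-weighted mixture at issue: candidate models weighted by 2^(-description length), with the normalized mixture standing in for the unknown true environment. The models and their description lengths are invented for illustration; this is not Hutter's actual construction.

# Toy illustration (not Hutter's construction): an Occam-style prior
# that weights each candidate model by 2^(-description length), so
# simpler models dominate the mixture until data says otherwise.

from fractions import Fraction

# Hypothetical candidate models for a binary sequence, each given as
# (name, description_length_in_bits, probability_that_next_bit_is_1).
models = [
    ("all zeros",  3, Fraction(1, 100)),
    ("all ones",   3, Fraction(99, 100)),
    ("fair coin",  5, Fraction(1, 2)),
    ("biased 3/4", 9, Fraction(3, 4)),
]

# Occam prior: weight proportional to 2^-length, then normalize.
weights = {name: Fraction(1, 2**length) for name, length, _ in models}
total = sum(weights.values())
prior = {name: w / total for name, w in weights.items()}

# Mixture prediction for "next bit is 1" -- analogous in spirit to
# using a weighted mixture as an estimate of the true environment.
p_one = sum(prior[name] * p for name, _, p in models)
print(float(p_one))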
Re: [agi] Occam's Razor and its abuse
I think Hutter is being modest.

-- Matt Mahoney, [EMAIL PROTECTED]

--- On Fri, 10/31/08, Mark Waser [EMAIL PROTECTED] wrote:

From: Mark Waser [EMAIL PROTECTED]
Subject: Re: [agi] Occam's Razor and its abuse
To: agi@v2.listbox.com
Date: Friday, October 31, 2008, 5:41 PM

Let's try this . . . . In "Universal Algorithmic Intelligence", on page 20, Hutter uses Occam's razor in the definition of ξ. Then, at the bottom of the page, he merely claims that using ξ as an estimate for μ "may be a reasonable thing to do." That's not a proof of Occam's Razor.

= = = = = =

He also references Occam's Razor on page 33, where he says: "We believe the answer to be negative, which on the positive side would show the necessity of Occam's razor assumption, and the distinguishedness of AIXI." That's calling Occam's razor a necessary assumption and basing that upon a *belief*.

= = = = = =

Where do you believe that he proves Occam's razor?

- Original Message -
From: Matt Mahoney [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Wednesday, October 29, 2008 10:46 PM
Subject: Re: [agi] Occam's Razor and its abuse

--- On Wed, 10/29/08, Mark Waser [EMAIL PROTECTED] wrote:

Hutter *defined* the measure of correctness using simplicity as a component. Of course, they're correlated when you do such a thing. That's not a proof, that's an assumption.

Hutter defined the measure of correctness as the accumulated reward by the agent in AIXI.

-- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Occam's Razor and its abuse
I think Hutter is being modest.

Huh? So . . . . are you going to continue claiming that Occam's Razor is proved, or are you going to stop (or are you going to point me to the proof)?

- Original Message -
From: Matt Mahoney [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Friday, October 31, 2008 5:54 PM
Subject: Re: [agi] Occam's Razor and its abuse

I think Hutter is being modest.

-- Matt Mahoney, [EMAIL PROTECTED]

--- On Fri, 10/31/08, Mark Waser [EMAIL PROTECTED] wrote:

From: Mark Waser [EMAIL PROTECTED]
Subject: Re: [agi] Occam's Razor and its abuse
To: agi@v2.listbox.com
Date: Friday, October 31, 2008, 5:41 PM

Let's try this . . . . In "Universal Algorithmic Intelligence", on page 20, Hutter uses Occam's razor in the definition of ξ. Then, at the bottom of the page, he merely claims that using ξ as an estimate for μ "may be a reasonable thing to do." That's not a proof of Occam's Razor.

= = = = = =

He also references Occam's Razor on page 33, where he says: "We believe the answer to be negative, which on the positive side would show the necessity of Occam's razor assumption, and the distinguishedness of AIXI." That's calling Occam's razor a necessary assumption and basing that upon a *belief*.

= = = = = =

Where do you believe that he proves Occam's razor?

- Original Message -
From: Matt Mahoney [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Wednesday, October 29, 2008 10:46 PM
Subject: Re: [agi] Occam's Razor and its abuse

--- On Wed, 10/29/08, Mark Waser [EMAIL PROTECTED] wrote:

Hutter *defined* the measure of correctness using simplicity as a component. Of course, they're correlated when you do such a thing. That's not a proof, that's an assumption.

Hutter defined the measure of correctness as the accumulated reward by the agent in AIXI.

-- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Occam's Razor and its abuse
(1) Simplicity (in conclusions, hypotheses, theories, etc.) is preferred. (2) The preference for simplicity does not need a reason or justification. (3) Simplicity is preferred because it is correlated with correctness. I agree with (1), but not (2) and (3).

I concur, but would add that (4) Simplicity is preferred because it is correlated with correctness *of implementation* (or ease of implementing correctly :-)

- Original Message -
From: Pei Wang [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Tuesday, October 28, 2008 10:15 PM
Subject: Re: [agi] Occam's Razor and its abuse

Eric,

I highly respect your work, though we clearly have different opinions on what intelligence is, as well as on how to achieve it. For example, though learning and generalization play central roles in my theory about intelligence, I don't think PAC learning (or the other learning algorithms proposed so far) provides a proper conceptual framework for the typical situation of this process. Generally speaking, I'm not building some system that learns about the world in the sense that there is a correct way to describe the world waiting to be discovered, which can be captured by some algorithm. Instead, learning to me is a non-algorithmic, open-ended process by which the system summarizes its own experience and uses it to predict the future. I fully understand that most people in this field probably consider this opinion wrong, though I haven't been convinced yet by the arguments I've seen so far.

Instead of addressing all of the relevant issues, in this discussion I have a very limited goal. To rephrase what I said initially, I see that under the term "Occam's Razor", currently there are three different statements:

(1) Simplicity (in conclusions, hypotheses, theories, etc.) is preferred.
(2) The preference for simplicity does not need a reason or justification.
(3) Simplicity is preferred because it is correlated with correctness.

I agree with (1), but not (2) and (3). I know many people have different opinions, and I don't attempt to argue with them here --- these problems are too complicated to be settled by email exchanges. However, I do hope to convince people in this discussion that the three statements are not logically equivalent, and (2) and (3) are not implied by (1), so to use "Occam's Razor" to refer to all of them is not a good idea, because it is going to mix different issues. Therefore, I suggest people use "Occam's Razor" in its original and basic sense, that is, (1), and use other terms to refer to (2) and (3). Otherwise, when people talk about Occam's Razor, I just don't know what to say.

Pei

On Tue, Oct 28, 2008 at 8:09 PM, Eric Baum [EMAIL PROTECTED] wrote:

Pei Triggered by several recent discussions, I'd like to make the
Pei following position statement, though won't commit myself to long
Pei debate on it. ;-)

Pei Occam's Razor, in its original form, goes like "entities must not
Pei be multiplied beyond necessity", and it is often stated as "All
Pei other things being equal, the simplest solution is the best" or
Pei "when multiple competing theories are equal in other respects,
Pei the principle recommends selecting the theory that introduces the
Pei fewest assumptions and postulates the fewest entities" --- all
Pei from http://en.wikipedia.org/wiki/Occam's_razor

Pei I fully agree with all of the above statements.

Pei However, to me, there are two common misunderstandings associated
Pei with it in the context of AGI and philosophy of science.

Pei (1) To take this statement as self-evident or a stand-alone
Pei postulate

Pei To me, it is derived or implied by the insufficiency of
Pei resources. If a system has sufficient resources, it has no good
Pei reason to prefer a simpler theory.

With all due respect, this is mistaken. Occam's Razor, in some form, is the heart of Generalization, which is the essence (and the G) of GI. For example, if you study concept learning from examples, say in the PAC learning context (related theorems hold in some other contexts as well), there are theorems to the effect that if you find a hypothesis from a simple enough class of hypotheses, it will with very high probability accurately classify new examples chosen from the same distribution, and conversely, theorems that state (roughly speaking) that any method that chooses a hypothesis from too expressive a class of hypotheses will have a probability that can be bounded below by some reasonable number, like 1/7, of having large error in its predictions on new examples -- in other words, it is impossible to PAC-learn without respecting Occam's Razor. For discussion of the above paragraphs, I'd refer you to Chapter 4 of What is Thought? (MIT Press, 2004).

In other words, if you are building some system that learns about the world, it had better respect Occam's razor if you want whatever it learns to apply to new experience. (I use the term Occam's razor loosely; using hypotheses that are highly constrained
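The "theorems to the effect that..." Eric mentions can be made concrete with the standard Occam bound for a finite hypothesis class: a hypothesis consistent with m i.i.d. examples has true error below epsilon with probability at least 1 - delta, provided m >= (ln|H| + ln(1/delta)) / epsilon. A small sketch with illustrative numbers (the bit counts below are invented, not from the thread):

# A worked instance of the flavor of theorem Eric describes: the
# standard Occam bound for a finite hypothesis class H, where |H| is
# determined by the description length of the class in bits.

import math

def occam_sample_size(log2_H: float, epsilon: float, delta: float) -> int:
    """Examples needed so a consistent hypothesis generalizes
    to error < epsilon with probability >= 1 - delta."""
    ln_H = log2_H * math.log(2.0)
    return math.ceil((ln_H + math.log(1.0 / delta)) / epsilon)

# A "simple" class describable in 20 bits vs. a richer 200-bit class:
for bits in (20, 200):
    print(bits, "bits ->", occam_sample_size(bits, epsilon=0.05, delta=0.01))
# The simpler class generalizes from far fewer examples -- the sense in
# which respecting Occam's Razor matters for PAC-style guarantees.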
Re: [agi] Occam's Razor and its abuse
Hutter proved (3), although as a general principle it was already a well-established practice in machine learning. Also, I agree with (4) but this is not the primary reason to prefer simplicity.

Hutter *defined* the measure of correctness using simplicity as a component. Of course, they're correlated when you do such a thing. That's not a proof, that's an assumption.

Regarding (4), I was deliberately ambiguous as to whether I meant implementation of the thinking system or implementation of the thought itself.

- Original Message -
From: Matt Mahoney [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Wednesday, October 29, 2008 11:11 AM
Subject: Re: [agi] Occam's Razor and its abuse

--- On Wed, 10/29/08, Mark Waser [EMAIL PROTECTED] wrote:

(1) Simplicity (in conclusions, hypotheses, theories, etc.) is preferred. (2) The preference for simplicity does not need a reason or justification. (3) Simplicity is preferred because it is correlated with correctness. I agree with (1), but not (2) and (3).

I concur, but would add that (4) Simplicity is preferred because it is correlated with correctness *of implementation* (or ease of implementing correctly :-)

Occam said (1) but had no proof. Hutter proved (3), although as a general principle it was already a well-established practice in machine learning. Also, I agree with (4) but this is not the primary reason to prefer simplicity.

-- Matt Mahoney, [EMAIL PROTECTED]
RE: [agi] Occam's Razor and its abuse
Pei,

My understanding is that when you reason from data, you often want the ability to extrapolate, which requires some sort of assumption about the type of mathematical model to be used. How do you deal with that in NARS?

Ed Porter

-Original Message-
From: Pei Wang [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 28, 2008 9:40 PM
To: agi@v2.listbox.com
Subject: Re: [agi] Occam's Razor and its abuse

Ed,

Since NARS doesn't follow the Bayesian approach, there are no initial priors to be assumed. If we use a more general term, such as "initial knowledge" or "innate beliefs", then yes, you can add them into the system, which will improve the system's performance. However, they are optional. In NARS, all object-level (i.e., not meta-level) innate beliefs can be learned by the system afterward.

Pei

On Tue, Oct 28, 2008 at 5:37 PM, Ed Porter [EMAIL PROTECTED] wrote:

It appears to me that the assumptions about initial priors used by a self-learning AGI or an evolutionary line of AGIs could be quite minimal. My understanding is that once a probability distribution starts receiving random samples from its distribution, the effect of the original prior becomes rapidly lost, unless it is a rather rare one. Such rare, problematic priors would get selected against quickly by evolution. Evolution would tend to tune for the most appropriate priors for the success of subsequent generations (either for computing in the same system, if it is capable of enough change, or in descendant systems). Probably the best priors would generally be ones that could be trained moderately rapidly by data.

So it seems an evolutionary system or line could initially learn priors without any assumptions for priors other than a random picking of priors. Over time and multiple generations it might develop hereditary priors, and perhaps even different hereditary priors for parts of its network connected to different inputs, outputs, or internal controls.

The use of priors in an AGI could be greatly improved by having a gen/comp hierarchy in which models for a given concept could be inherited from the priors of sets of models for similar concepts, and in which the set of appropriate priors could change contextually. It would also seem that the notion of a prior could be improved by blending information from episodic and probabilistic models.

It would appear that in almost any generally intelligent system, being able to approximate reality in a manner sufficient for evolutionary success, with the most efficient representations, would be a characteristic greatly preferred by evolution, because it would allow systems to better model more of their environment sufficiently well for evolutionary success with whatever current modeling capacity they have. So, although a completely accurate description of virtually anything may not find much use for Occam's Razor, as a practically useful representation it often will. It seems to me that Occam's Razor is more oriented to deriving meaningful generalizations than to exact descriptions of anything.

Furthermore, it would seem to me that a simpler set of preconditions is generally more probable than a more complex one, because it requires less coincidence. It would seem to me this would be true under most random sets of priors for the probabilities of the possible sets of components involved and Occam's Razor type selection.

These are the musings of an untrained mind, since I have not spent much time studying philosophy, because such a high percentage of it was so obviously stupid (such as what was commonly said when I was young, that you can't have intelligence without language), and my understanding of math is much less than that of many on this list. But nonetheless I think much of what I have said above is true. I think its gist is not totally dissimilar to what Abram has said.

Ed Porter

-Original Message-
From: Pei Wang [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 28, 2008 3:05 PM
To: agi@v2.listbox.com
Subject: Re: [agi] Occam's Razor and its abuse

Abram,

I agree with your basic idea in the following, though I usually put it in a different form.

Pei

On Tue, Oct 28, 2008 at 2:52 PM, Abram Demski [EMAIL PROTECTED] wrote:

Ben,

You assert that Pei is forced to make an assumption about the regularity of the world to justify adaptation. Pei could also take a different argument. He could try to show that *if* a strategy exists that can be implemented given the finite resources, NARS will eventually find it. Thus, adaptation is justified on a sort of "we might as well try" basis. (The proof would involve showing that NARS searches the space of finite-state machines that can be implemented with the resources at hand, and is more likely to stay for longer periods of time in configurations that give more reward, such that NARS would eventually
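Ed's point that the original prior "becomes rapidly lost" under random samples can be illustrated with a Beta-Bernoulli model (my example, not his): two sharply different priors converge to nearly the same posterior mean as evidence accumulates.

# Sketch of the prior-washout point: two very different Beta priors
# over a coin's bias end up with nearly the same posterior mean once
# enough samples arrive. The true bias and priors are invented.

import random

random.seed(0)
true_p = 0.7
priors = {"optimistic Beta(8,2)": (8, 2), "pessimistic Beta(2,8)": (2, 8)}

heads = tails = 0
for n in range(1, 1001):
    if random.random() < true_p:
        heads += 1
    else:
        tails += 1
    if n in (10, 100, 1000):
        for name, (a, b) in priors.items():
            mean = (a + heads) / (a + b + heads + tails)  # posterior mean
            print(n, name, round(mean, 3))
# After 1000 flips both posterior means sit near 0.7: the prior's
# contribution is fixed while the evidence grows without bound.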
Re: [agi] Occam's Razor and its abuse
Ed,

When NARS extrapolates its past experience to the current and the future, it is indeed based on the assumption that its future experience will be similar to its past experience (otherwise any prediction would be equally valid); however, it does not assume the world can be captured by any specific mathematical model, such as a Turing machine or a probability distribution defined on a propositional space.

Concretely speaking, when a statement S has been tested N times, and in M times it is true, but in N-M times it is false, then NARS's expectation value for it to be true in the next testing is E(S) = (M+0.5)/(N+1) [if there is no other relevant knowledge], and the system will use this value to decide whether to accept a bet on S. However, neither the system nor its designer assumes that there is a "true probability" for S to occur, for which the above expectation is an approximation. Also, it is not assumed that E(S) will converge when the testing on S continues.

Pei

On Wed, Oct 29, 2008 at 11:33 AM, Ed Porter [EMAIL PROTECTED] wrote:

Pei,

My understanding is that when you reason from data, you often want the ability to extrapolate, which requires some sort of assumption about the type of mathematical model to be used. How do you deal with that in NARS?

Ed Porter

-Original Message-
From: Pei Wang [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 28, 2008 9:40 PM
To: agi@v2.listbox.com
Subject: Re: [agi] Occam's Razor and its abuse

Ed,

Since NARS doesn't follow the Bayesian approach, there are no initial priors to be assumed. If we use a more general term, such as "initial knowledge" or "innate beliefs", then yes, you can add them into the system, which will improve the system's performance. However, they are optional. In NARS, all object-level (i.e., not meta-level) innate beliefs can be learned by the system afterward.

Pei

On Tue, Oct 28, 2008 at 5:37 PM, Ed Porter [EMAIL PROTECTED] wrote:

It appears to me that the assumptions about initial priors used by a self-learning AGI or an evolutionary line of AGIs could be quite minimal. My understanding is that once a probability distribution starts receiving random samples from its distribution, the effect of the original prior becomes rapidly lost, unless it is a rather rare one. Such rare, problematic priors would get selected against quickly by evolution. Evolution would tend to tune for the most appropriate priors for the success of subsequent generations (either for computing in the same system, if it is capable of enough change, or in descendant systems). Probably the best priors would generally be ones that could be trained moderately rapidly by data.

So it seems an evolutionary system or line could initially learn priors without any assumptions for priors other than a random picking of priors. Over time and multiple generations it might develop hereditary priors, and perhaps even different hereditary priors for parts of its network connected to different inputs, outputs, or internal controls.

The use of priors in an AGI could be greatly improved by having a gen/comp hierarchy in which models for a given concept could be inherited from the priors of sets of models for similar concepts, and in which the set of appropriate priors could change contextually. It would also seem that the notion of a prior could be improved by blending information from episodic and probabilistic models.

It would appear that in almost any generally intelligent system, being able to approximate reality in a manner sufficient for evolutionary success, with the most efficient representations, would be a characteristic greatly preferred by evolution, because it would allow systems to better model more of their environment sufficiently well for evolutionary success with whatever current modeling capacity they have. So, although a completely accurate description of virtually anything may not find much use for Occam's Razor, as a practically useful representation it often will. It seems to me that Occam's Razor is more oriented to deriving meaningful generalizations than to exact descriptions of anything.

Furthermore, it would seem to me that a simpler set of preconditions is generally more probable than a more complex one, because it requires less coincidence. It would seem to me this would be true under most random sets of priors for the probabilities of the possible sets of components involved and Occam's Razor type selection.

These are the musings of an untrained mind, since I have not spent much time studying philosophy, because such a high percentage of it was so obviously stupid (such as what was commonly said when I was young, that you can't have intelligence without language), and my understanding of math is much less than that of many on this list. But nonetheless I think much of what I have said above is true. I think its gist is not totally dissimilar to what Abram has said.
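A minimal sketch of the expectation rule Pei states above, with a hypothetical betting rule added for illustration (the betting threshold is my invention, not part of NARS):

# NARS-style expectation from the thread: after N tests of statement S
# with M successes, E(S) = (M + 0.5)/(N + 1), used to decide whether
# to accept a bet on S.

def expectation(m_positive: int, n_total: int) -> float:
    """Expectation for S, assuming no other relevant knowledge."""
    return (m_positive + 0.5) / (n_total + 1)

def accept_bet(m_positive: int, n_total: int, payout_odds: float) -> bool:
    # Hypothetical rule: accept when the bet's expected value is positive.
    e = expectation(m_positive, n_total)
    return e * payout_odds > (1 - e)

print(expectation(0, 0))   # 0.5: total ignorance, not certainty
print(expectation(9, 10))  # ~0.864: strong but revisable support
# Note E(S) need not converge as testing continues -- no "true
# probability" of S is assumed, matching Pei's description.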
Re: [agi] Occam's Razor and its abuse
But, NARS as an overall software system will perform more effectively (i.e., learn more rapidly) in some environments than in others, for a variety of reasons. There are many biases built into the NARS architecture in various ways ... it's just not obvious to spell out what they are, because the NARS system was not explicitly designed based on that sort of thinking... The same is true of every other complex AGI architecture...

ben g

On Wed, Oct 29, 2008 at 12:07 PM, Pei Wang [EMAIL PROTECTED] wrote:

Ed,

When NARS extrapolates its past experience to the current and the future, it is indeed based on the assumption that its future experience will be similar to its past experience (otherwise any prediction would be equally valid); however, it does not assume the world can be captured by any specific mathematical model, such as a Turing machine or a probability distribution defined on a propositional space.

Concretely speaking, when a statement S has been tested N times, and in M times it is true, but in N-M times it is false, then NARS's expectation value for it to be true in the next testing is E(S) = (M+0.5)/(N+1) [if there is no other relevant knowledge], and the system will use this value to decide whether to accept a bet on S. However, neither the system nor its designer assumes that there is a "true probability" for S to occur, for which the above expectation is an approximation. Also, it is not assumed that E(S) will converge when the testing on S continues.

Pei

On Wed, Oct 29, 2008 at 11:33 AM, Ed Porter [EMAIL PROTECTED] wrote:

Pei,

My understanding is that when you reason from data, you often want the ability to extrapolate, which requires some sort of assumption about the type of mathematical model to be used. How do you deal with that in NARS?

Ed Porter

-Original Message-
From: Pei Wang [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 28, 2008 9:40 PM
To: agi@v2.listbox.com
Subject: Re: [agi] Occam's Razor and its abuse

Ed,

Since NARS doesn't follow the Bayesian approach, there are no initial priors to be assumed. If we use a more general term, such as "initial knowledge" or "innate beliefs", then yes, you can add them into the system, which will improve the system's performance. However, they are optional. In NARS, all object-level (i.e., not meta-level) innate beliefs can be learned by the system afterward.

Pei

On Tue, Oct 28, 2008 at 5:37 PM, Ed Porter [EMAIL PROTECTED] wrote:

It appears to me that the assumptions about initial priors used by a self-learning AGI or an evolutionary line of AGIs could be quite minimal. My understanding is that once a probability distribution starts receiving random samples from its distribution, the effect of the original prior becomes rapidly lost, unless it is a rather rare one. Such rare, problematic priors would get selected against quickly by evolution. Evolution would tend to tune for the most appropriate priors for the success of subsequent generations (either for computing in the same system, if it is capable of enough change, or in descendant systems). Probably the best priors would generally be ones that could be trained moderately rapidly by data.

So it seems an evolutionary system or line could initially learn priors without any assumptions for priors other than a random picking of priors. Over time and multiple generations it might develop hereditary priors, and perhaps even different hereditary priors for parts of its network connected to different inputs, outputs, or internal controls.

The use of priors in an AGI could be greatly improved by having a gen/comp hierarchy in which models for a given concept could be inherited from the priors of sets of models for similar concepts, and in which the set of appropriate priors could change contextually. It would also seem that the notion of a prior could be improved by blending information from episodic and probabilistic models.

It would appear that in almost any generally intelligent system, being able to approximate reality in a manner sufficient for evolutionary success, with the most efficient representations, would be a characteristic greatly preferred by evolution, because it would allow systems to better model more of their environment sufficiently well for evolutionary success with whatever current modeling capacity they have. So, although a completely accurate description of virtually anything may not find much use for Occam's Razor, as a practically useful representation it often will. It seems to me that Occam's Razor is more oriented to deriving meaningful generalizations than to exact descriptions of anything.

Furthermore, it would seem to me that a simpler set of preconditions is generally more probable than a more complex one, because it requires less coincidence. It would seem to me this would be true
Re: [agi] Occam's Razor and its abuse
Ben,

I never claimed that NARS is not based on assumptions (or call them biases), but only on truths. It surely is, and many of the assumptions are my beliefs and intuitions, which I cannot convince other people to accept very soon. However, it does not mean that all assumptions are equally acceptable, or that as soon as something is called an assumption, the author will be released from the duty of justifying it.

Going back to the original topic: since "the simplicity/complexity of a description is correlated with its prior probability" is the core assumption of certain research paradigms, it should be justified. Calling it "Occam's Razor" so as to suggest it is self-evident is not the proper way to do the job. This is all I want to argue in this discussion.

Pei

On Wed, Oct 29, 2008 at 12:10 PM, Ben Goertzel [EMAIL PROTECTED] wrote:

But, NARS as an overall software system will perform more effectively (i.e., learn more rapidly) in some environments than in others, for a variety of reasons. There are many biases built into the NARS architecture in various ways ... it's just not obvious to spell out what they are, because the NARS system was not explicitly designed based on that sort of thinking... The same is true of every other complex AGI architecture...

ben g

On Wed, Oct 29, 2008 at 12:07 PM, Pei Wang [EMAIL PROTECTED] wrote:

Ed,

When NARS extrapolates its past experience to the current and the future, it is indeed based on the assumption that its future experience will be similar to its past experience (otherwise any prediction would be equally valid); however, it does not assume the world can be captured by any specific mathematical model, such as a Turing machine or a probability distribution defined on a propositional space.

Concretely speaking, when a statement S has been tested N times, and in M times it is true, but in N-M times it is false, then NARS's expectation value for it to be true in the next testing is E(S) = (M+0.5)/(N+1) [if there is no other relevant knowledge], and the system will use this value to decide whether to accept a bet on S. However, neither the system nor its designer assumes that there is a "true probability" for S to occur, for which the above expectation is an approximation. Also, it is not assumed that E(S) will converge when the testing on S continues.

Pei

On Wed, Oct 29, 2008 at 11:33 AM, Ed Porter [EMAIL PROTECTED] wrote:

Pei,

My understanding is that when you reason from data, you often want the ability to extrapolate, which requires some sort of assumption about the type of mathematical model to be used. How do you deal with that in NARS?

Ed Porter

-Original Message-
From: Pei Wang [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 28, 2008 9:40 PM
To: agi@v2.listbox.com
Subject: Re: [agi] Occam's Razor and its abuse

Ed,

Since NARS doesn't follow the Bayesian approach, there are no initial priors to be assumed. If we use a more general term, such as "initial knowledge" or "innate beliefs", then yes, you can add them into the system, which will improve the system's performance. However, they are optional. In NARS, all object-level (i.e., not meta-level) innate beliefs can be learned by the system afterward.

Pei

On Tue, Oct 28, 2008 at 5:37 PM, Ed Porter [EMAIL PROTECTED] wrote:

It appears to me that the assumptions about initial priors used by a self-learning AGI or an evolutionary line of AGIs could be quite minimal.

My understanding is that once a probability distribution starts receiving random samples from its distribution, the effect of the original prior becomes rapidly lost, unless it is a rather rare one. Such rare, problematic priors would get selected against quickly by evolution. Evolution would tend to tune for the most appropriate priors for the success of subsequent generations (either for computing in the same system, if it is capable of enough change, or in descendant systems). Probably the best priors would generally be ones that could be trained moderately rapidly by data.

So it seems an evolutionary system or line could initially learn priors without any assumptions for priors other than a random picking of priors. Over time and multiple generations it might develop hereditary priors, and perhaps even different hereditary priors for parts of its network connected to different inputs, outputs, or internal controls.

The use of priors in an AGI could be greatly improved by having a gen/comp hierarchy in which models for a given concept could be inherited from the priors of sets of models for similar concepts, and in which the set of appropriate priors could change contextually. It would also seem that the notion of a prior could be improved by blending information from episodic and probabilistic models.

It would appear that in almost any generally intelligent system, being able to approximate reality in a manner
Re: [agi] Occam's Razor and its abuse
However, it does not mean that all assumptions are equally acceptable, or that as soon as something is called an assumption, the author will be released from the duty of justifying it.

Hume argued that at the basis of any approach to induction, there will necessarily lie some assumption that is *not* inductively justified, but must in essence be taken on faith or as an unjustified assumption. He claimed that humans make certain unjustified assumptions of this nature automatically, due to human nature.

This is an argument that not all assumptions can be expected to be justified ...

Comments?

ben g
Re: [agi] Occam's Razor and its abuse
Ben,

It goes back to what kind of justification we are talking about. To prove it is a strong version; to show supporting evidence is a weak version.

Hume pointed out that induction cannot be justified in the sense that there is no way to guarantee that all inductive conclusions will be confirmed. I don't think Hume can be cited to support the assumption that complexity is correlated with probability, or that this assumption does not need justification, just because inductive conclusions can be wrong. There are many more reasons to accept induction than to accept the above assumption.

Pei

On Wed, Oct 29, 2008 at 12:31 PM, Ben Goertzel [EMAIL PROTECTED] wrote:

However, it does not mean that all assumptions are equally acceptable, or that as soon as something is called an assumption, the author will be released from the duty of justifying it.

Hume argued that at the basis of any approach to induction, there will necessarily lie some assumption that is *not* inductively justified, but must in essence be taken on faith or as an unjustified assumption. He claimed that humans make certain unjustified assumptions of this nature automatically, due to human nature.

This is an argument that not all assumptions can be expected to be justified ...

Comments?

ben g
Re: [agi] Occam's Razor and its abuse
--- On Tue, 10/28/08, Pei Wang [EMAIL PROTECTED] wrote:

Whenever someone proves something outside mathematics, it is always based on certain assumptions. If the assumptions are not well justified, there is no strong reason for people to accept the conclusion, even though the proof process is correct.

My assumption is that the physics of the observable universe is computable (which is widely believed to be true). If it is true, then AIXI proves that Occam's Razor holds.

-- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Occam's Razor and its abuse
--- On Wed, 10/29/08, Mark Waser [EMAIL PROTECTED] wrote:

Hutter *defined* the measure of correctness using simplicity as a component. Of course, they're correlated when you do such a thing. That's not a proof, that's an assumption.

Hutter defined the measure of correctness as the accumulated reward by the agent in AIXI.

-- Matt Mahoney, [EMAIL PROTECTED]
[agi] Occam's Razor and its abuse
Triggered by several recent discussions, I'd like to make the following position statement, though won't commit myself to long debate on it. ;-)

Occam's Razor, in its original form, goes like "entities must not be multiplied beyond necessity", and it is often stated as "All other things being equal, the simplest solution is the best" or "when multiple competing theories are equal in other respects, the principle recommends selecting the theory that introduces the fewest assumptions and postulates the fewest entities" --- all from http://en.wikipedia.org/wiki/Occam's_razor

I fully agree with all of the above statements. However, to me, there are two common misunderstandings associated with it in the context of AGI and philosophy of science.

(1) To take this statement as self-evident or a stand-alone postulate

To me, it is derived or implied by the insufficiency of resources. If a system has sufficient resources, it has no good reason to prefer a simpler theory.

(2) To take it to mean "The simplest answer is usually the correct answer."

This is a very different statement, which cannot be justified either analytically or empirically. When theory A is an approximation of theory B, usually the former is simpler than the latter, but less correct or accurate, in terms of its relation with all available evidence. When we are short on resources and have a low demand on accuracy, we often prefer A over B, but it does not mean that by doing so we judge A as more correct than B.

In summary, in choosing among alternative theories or conclusions, the preference for simplicity comes from shortage of resources, though simplicity and correctness are logically independent of each other.

Pei
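Pei's theory-A-versus-theory-B point can be made concrete with a toy model comparison (my example, not his): a line is simpler than a cubic fit to the same data, but fits the available evidence less accurately; preferring the line under resource pressure is not a judgment that it is more correct.

# Toy example of "simpler theory A approximates richer theory B":
# fit the same (invented) data with a line and with a cubic. The line
# is cheaper to store and evaluate but fits the evidence less
# accurately -- simplicity and accuracy pull apart.

import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 50)
y = 1.0 + 0.5 * x - 2.0 * x**3 + rng.normal(0, 0.05, x.size)

for degree, label in ((1, "theory A (line)"), (3, "theory B (cubic)")):
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    print(label, "params:", degree + 1,
          "mean squared error:", round(float(np.mean(resid**2)), 4))
# A uses fewer parameters; B matches the evidence better. Choosing A
# under resource limits does not make A "more correct" than B.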
Re: [agi] Occam's Razor and its abuse
Ben,

Thanks. So the other people now see that I'm not attacking a straw man.

My solution to Hume's problem, as embedded in the experience-grounded semantics, is to assume no predictability, but to justify induction as adaptation. However, it is a separate topic which I've explained in my other publications.

Here I just want to point out that the original and basic meaning of Occam's Razor and those two common (mis)usages of it are not necessarily the same. I fully agree with the former, but not the latter, and I haven't seen any convincing justification of the latter. Instead, they are often taken for granted, under the name of Occam's Razor.

Pei

On Tue, Oct 28, 2008 at 12:37 PM, Ben Goertzel [EMAIL PROTECTED] wrote:

Hi Pei,

This is an interesting perspective; I just want to clarify for others on the list that it is a particular and controversial perspective, and contradicts the perspectives of many other well-informed research professionals and deep thinkers on relevant topics.

Many serious thinkers in the area *do* consider Occam's Razor a standalone postulate. This fits in naturally with the Bayesian perspective, in which one needs to assume *some* prior distribution, so one often assumes some sort of Occam prior (e.g. the Solomonoff-Levin prior, the speed prior, etc.) as a standalone postulate.

Hume pointed out that induction (in the old sense of extrapolating from the past into the future) is not solvable except by introducing some kind of a priori assumption. Occam's Razor, in one form or another, is a suitable a priori assumption to plug into this role.

If you want to replace the Occam's Razor assumption with the assumption that the world is predictable by systems with limited resources, and that we will prefer explanations that consume less resources, that seems unproblematic, as it's basically equivalent to assuming an Occam prior. On the other hand, I just want to point out that to get around Hume's complaint you do need to make *some* kind of assumption about the regularity of the world. What kind of assumption of this nature underlies your work on NARS (if any)?

ben

On Tue, Oct 28, 2008 at 8:58 AM, Pei Wang [EMAIL PROTECTED] wrote:

Triggered by several recent discussions, I'd like to make the following position statement, though won't commit myself to long debate on it. ;-)

Occam's Razor, in its original form, goes like "entities must not be multiplied beyond necessity", and it is often stated as "All other things being equal, the simplest solution is the best" or "when multiple competing theories are equal in other respects, the principle recommends selecting the theory that introduces the fewest assumptions and postulates the fewest entities" --- all from http://en.wikipedia.org/wiki/Occam's_razor

I fully agree with all of the above statements. However, to me, there are two common misunderstandings associated with it in the context of AGI and philosophy of science.

(1) To take this statement as self-evident or a stand-alone postulate

To me, it is derived or implied by the insufficiency of resources. If a system has sufficient resources, it has no good reason to prefer a simpler theory.

(2) To take it to mean "The simplest answer is usually the correct answer."

This is a very different statement, which cannot be justified either analytically or empirically. When theory A is an approximation of theory B, usually the former is simpler than the latter, but less correct or accurate, in terms of its relation with all available evidence. When we are short on resources and have a low demand on accuracy, we often prefer A over B, but it does not mean that by doing so we judge A as more correct than B.

In summary, in choosing among alternative theories or conclusions, the preference for simplicity comes from shortage of resources, though simplicity and correctness are logically independent of each other.

Pei

--
Ben Goertzel, PhD
CEO, Novamente LLC and Biomind LLC
Director of Research, SIAI
[EMAIL PROTECTED]

"A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects." -- Robert Heinlein
Re: [agi] Occam's Razor and its abuse
Ben,

You assert that Pei is forced to make an assumption about the regularity of the world to justify adaptation. Pei could also take a different argument. He could try to show that *if* a strategy exists that can be implemented given the finite resources, NARS will eventually find it. Thus, adaptation is justified on a sort of "we might as well try" basis. (The proof would involve showing that NARS searches the space of finite-state machines that can be implemented with the resources at hand, and is more likely to stay for longer periods of time in configurations that give more reward, such that NARS would eventually settle on a configuration if that configuration consistently gave the highest reward.)

So, some form of learning can take place with no assumptions. The problem is that the search space is exponential in the resources available, so there is some maximum point where the system would perform best (because the amount of resources matches the problem), but giving the system more resources would hurt performance (because the system searches the unnecessarily large search space). So, in this sense, the system's behavior seems counterintuitive -- it does not seem to be taking advantage of the increased resources. I'm not claiming NARS would have that problem, of course, just that a theoretical no-assumption learner would.

--Abram

On Tue, Oct 28, 2008 at 2:12 PM, Ben Goertzel [EMAIL PROTECTED] wrote:

On Tue, Oct 28, 2008 at 10:00 AM, Pei Wang [EMAIL PROTECTED] wrote:

Ben,

Thanks. So the other people now see that I'm not attacking a straw man.

My solution to Hume's problem, as embedded in the experience-grounded semantics, is to assume no predictability, but to justify induction as adaptation. However, it is a separate topic which I've explained in my other publications.

Right, but justifying induction as adaptation only works if the environment is assumed to have certain regularities which can be adapted to. In a random environment, adaptation won't work. So, still, to justify induction as adaptation you have to make *some* assumptions about the world.

The Occam prior gives one such assumption: that (to give just one form) sets of observations in the world tend to be producible by short computer programs. For adaptation to successfully carry out induction, *some* vaguely comparable property to this must hold, and I'm not sure if you have articulated which one you assume, or if you leave this open.

In effect, you implicitly assume something like an Occam prior, because you're saying that a system with finite resources can successfully adapt to the world ... which means that sets of observations in the world *must* be approximately summarizable via subprograms that can be executed within this system. So I argue that, even though it's not your preferred way to think about it, your own approach to AI theory and practice implicitly assumes some variant of the Occam prior holds in the real world.

Here I just want to point out that the original and basic meaning of Occam's Razor and those two common (mis)usages of it are not necessarily the same. I fully agree with the former, but not the latter, and I haven't seen any convincing justification of the latter. Instead, they are often taken for granted, under the name of Occam's Razor.

I agree that the notion of an Occam prior is a significant conceptual step beyond the original Occam's Razor precept enunciated long ago. Also, I note that, for those who posit the Occam prior as a **prior assumption**, there is not supposed to be any convincing justification for it. The idea is simply that: one must make *some* assumption (explicitly or implicitly) if one wants to do induction, and this is the assumption that some people choose to make.

-- Ben G
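A sketch of the no-assumption learner Abram describes, under invented details (the environment, the dwell rule, and the eight-configuration space are mine): the system dwells longer in configurations that give more reward, and so eventually settles on the best one it can represent.

# Sketch of Abram's thought experiment: random search over a small
# space of configurations, leaving a configuration with probability
# inversely tied to its reward, so high-reward configurations
# accumulate dwell time and are eventually settled on.

import random

random.seed(0)

def reward(config: int) -> float:
    # Hypothetical fixed environment: one configuration is best.
    return 1.0 if config == 5 else 0.2

N_CONFIGS = 8  # in Abram's argument this space is exponential in resources
current = random.randrange(N_CONFIGS)
time_in = [0] * N_CONFIGS

for step in range(10_000):
    time_in[current] += 1
    # Leave with probability (1 - reward); stay put when reward is high.
    if random.random() > reward(current):
        current = random.randrange(N_CONFIGS)

best = max(range(N_CONFIGS), key=lambda c: time_in[c])
print("most-dwelt configuration:", best)  # settles on configuration 5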
Re: [agi] Occam's Razor and its abuse
Ben,

It seems that you agree the issue I pointed out really exists, but just take it as a necessary evil. Furthermore, you think I also assumed the same thing, though I failed to see it.

I won't argue against the "necessary evil" part --- as long as you agree that those postulates (such as "the universe is computable") are not convincingly justified. I won't try to disprove them.

As for the latter part, I don't think you can convince me that you know me better than I know myself. ;-)

The following is from http://nars.wang.googlepages.com/wang.semantics.pdf , page 28:

"If the answers provided by NARS are fallible, in what sense these answers are better than arbitrary guesses? This leads us to the concept of rationality. When infallible predictions cannot be obtained (due to insufficient knowledge and resources), answers based on past experience are better than arbitrary guesses, if the environment is relatively stable. To say an answer is only a summary of past experience (thus no future confirmation guaranteed) does not make it equal to an arbitrary conclusion --- it is what adaptation means. Adaptation is the process in which a system changes its behaviors as if the future is similar to the past. It is a rational process, even though individual conclusions it produces are often wrong. For this reason, valid inference rules (deduction, induction, abduction, and so on) are the ones whose conclusions correctly (according to the semantics) summarize the evidence in the premises. They are truth-preserving in this sense, not in the model-theoretic sense that they always generate conclusions which are immune from future revision."

--- so you see, I don't assume adaptation will always be successful, or even successful to a certain probability. You can dislike this conclusion, though you cannot say it is the same as what is assumed by Novamente and AIXI.

Pei

On Tue, Oct 28, 2008 at 2:12 PM, Ben Goertzel [EMAIL PROTECTED] wrote:

On Tue, Oct 28, 2008 at 10:00 AM, Pei Wang [EMAIL PROTECTED] wrote:

Ben,

Thanks. So the other people now see that I'm not attacking a straw man.

My solution to Hume's problem, as embedded in the experience-grounded semantics, is to assume no predictability, but to justify induction as adaptation. However, it is a separate topic which I've explained in my other publications.

Right, but justifying induction as adaptation only works if the environment is assumed to have certain regularities which can be adapted to. In a random environment, adaptation won't work. So, still, to justify induction as adaptation you have to make *some* assumptions about the world.

The Occam prior gives one such assumption: that (to give just one form) sets of observations in the world tend to be producible by short computer programs. For adaptation to successfully carry out induction, *some* vaguely comparable property to this must hold, and I'm not sure if you have articulated which one you assume, or if you leave this open.

In effect, you implicitly assume something like an Occam prior, because you're saying that a system with finite resources can successfully adapt to the world ... which means that sets of observations in the world *must* be approximately summarizable via subprograms that can be executed within this system. So I argue that, even though it's not your preferred way to think about it, your own approach to AI theory and practice implicitly assumes some variant of the Occam prior holds in the real world.

Here I just want to point out that the original and basic meaning of Occam's Razor and those two common (mis)usages of it are not necessarily the same. I fully agree with the former, but not the latter, and I haven't seen any convincing justification of the latter. Instead, they are often taken for granted, under the name of Occam's Razor.

I agree that the notion of an Occam prior is a significant conceptual step beyond the original Occam's Razor precept enunciated long ago. Also, I note that, for those who posit the Occam prior as a **prior assumption**, there is not supposed to be any convincing justification for it. The idea is simply that: one must make *some* assumption (explicitly or implicitly) if one wants to do induction, and this is the assumption that some people choose to make.

-- Ben G
Re: [agi] Occam's Razor and its abuse
Most certainly ... and the human mind seems to make a lot of other, more specialized assumptions about the environment also ... so that unless the environment satisfies a bunch of these other more specialized assumptions, its adaptation will be very slow and resource-inefficient...

ben g

On Tue, Oct 28, 2008 at 12:05 PM, Pei Wang [EMAIL PROTECTED] wrote:

We can say the same thing for the human mind, right?

Pei

On Tue, Oct 28, 2008 at 2:54 PM, Ben Goertzel [EMAIL PROTECTED] wrote:

Sure ... but my point is that unless the environment satisfies a certain Occam-prior-like property, NARS will be useless...

ben

On Tue, Oct 28, 2008 at 11:52 AM, Abram Demski [EMAIL PROTECTED] wrote:

Ben,

You assert that Pei is forced to make an assumption about the regularity of the world to justify adaptation. Pei could also take a different argument. He could try to show that *if* a strategy exists that can be implemented given the finite resources, NARS will eventually find it. Thus, adaptation is justified on a sort of "we might as well try" basis. (The proof would involve showing that NARS searches the space of finite-state machines that can be implemented with the resources at hand, and is more likely to stay for longer periods of time in configurations that give more reward, such that NARS would eventually settle on a configuration if that configuration consistently gave the highest reward.)

So, some form of learning can take place with no assumptions. The problem is that the search space is exponential in the resources available, so there is some maximum point where the system would perform best (because the amount of resources matches the problem), but giving the system more resources would hurt performance (because the system searches the unnecessarily large search space). So, in this sense, the system's behavior seems counterintuitive -- it does not seem to be taking advantage of the increased resources. I'm not claiming NARS would have that problem, of course, just that a theoretical no-assumption learner would.

--Abram

On Tue, Oct 28, 2008 at 2:12 PM, Ben Goertzel [EMAIL PROTECTED] wrote:

On Tue, Oct 28, 2008 at 10:00 AM, Pei Wang [EMAIL PROTECTED] wrote:

Ben,

Thanks. So the other people now see that I'm not attacking a straw man.

My solution to Hume's problem, as embedded in the experience-grounded semantics, is to assume no predictability, but to justify induction as adaptation. However, it is a separate topic which I've explained in my other publications.

Right, but justifying induction as adaptation only works if the environment is assumed to have certain regularities which can be adapted to. In a random environment, adaptation won't work. So, still, to justify induction as adaptation you have to make *some* assumptions about the world.

The Occam prior gives one such assumption: that (to give just one form) sets of observations in the world tend to be producible by short computer programs. For adaptation to successfully carry out induction, *some* vaguely comparable property to this must hold, and I'm not sure if you have articulated which one you assume, or if you leave this open.

In effect, you implicitly assume something like an Occam prior, because you're saying that a system with finite resources can successfully adapt to the world ... which means that sets of observations in the world *must* be approximately summarizable via subprograms that can be executed within this system. So I argue that, even though it's not your preferred way to think about it, your own approach to AI theory and practice implicitly assumes some variant of the Occam prior holds in the real world.

Here I just want to point out that the original and basic meaning of Occam's Razor and those two common (mis)usages of it are not necessarily the same. I fully agree with the former, but not the latter, and I haven't seen any convincing justification of the latter. Instead, they are often taken for granted, under the name of Occam's Razor.

I agree that the notion of an Occam prior is a significant conceptual step beyond the original Occam's Razor precept enunciated long ago. Also, I note that, for those who posit the Occam prior as a **prior assumption**, there is not supposed to be any convincing justification for it. The idea is simply that: one must make *some* assumption (explicitly or implicitly) if one wants to do induction, and this is the assumption that some people choose to make.

-- Ben G
Re: [agi] Occam's Razor and its abuse
We can say the same thing for the human mind, right? Pei On Tue, Oct 28, 2008 at 2:54 PM, Ben Goertzel [EMAIL PROTECTED] wrote: Sure ... but my point is that unless the environment satisfies a certain Occam-prior-like property, NARS will be useless... ben [snip]
Re: [agi] Occam's Razor and its abuse
Abram, I agree with your basic idea in the following, though I usually put it in different form. Pei On Tue, Oct 28, 2008 at 2:52 PM, Abram Demski [EMAIL PROTECTED] wrote: Ben, You assert that Pei is forced to make an assumption about the regularity of the world to justify adaptation. Pei could also make a different argument. He could try to show that *if* a strategy exists that can be implemented given the finite resources, NARS will eventually find it. Thus, adaptation is justified on a sort of "we might as well try" basis. [snip]
Re: [agi] Occam's Razor and its abuse
2008/10/28 Ben Goertzel [EMAIL PROTECTED]: On the other hand, I just want to point out that to get around Hume's complaint you do need to make *some* kind of assumption about the regularity of the world. What kind of assumption of this nature underlies your work on NARS (if any)? Not directed to me, but here is my take on this interesting question. The initial architecture would have limited assumptions about the world; the programming in the architecture would then form the bias. Initially the system would divide up the world into the simple (inanimate) and the highly complex (animate). Why should the system expect animate things to be complex? Because it applies the intentional stance and thinks that they are optimal problem solvers. Optimal problem solvers in a social environment tend toward high complexity, since there is an arms race over who can predict the others while not being predicted and exploited by them. Thinking "there are other things like me out here", when you are yourself a complex entity, entails thinking that things are complex, even when there might be simpler explanations (e.g., of what causes the weather). Will Pearson
Re: [agi] Occam's Razor and its abuse
What Hutter proved is (very roughly) that given massive computational resources, following Occam's Razor will be -- within some possibly quite large constant -- the best way to achieve goals in a computable environment... That's not exactly proving Occam's Razor, though it is a proof related to Occam's Razor... One could easily argue it is totally irrelevant to AI due to its assumption of massive computational resources ben g On Tue, Oct 28, 2008 at 2:23 PM, Matt Mahoney [EMAIL PROTECTED] wrote: Hutter proved Occam's Razor (AIXI) for the case of any environment with a computable probability distribution. It applies to us because the observable universe is Turing computable according to currently known laws of physics. Specifically, the observable universe has a finite description length (approximately 2.91 x 10^122 bits, the Bekenstein bound of the Hubble radius). AIXI has nothing to do with insufficiency of resources. Given unlimited resources we would still prefer the (algorithmically) simplest explanation because it is the most likely under a Solomonoff distribution of possible environments. Also, AIXI does not state that the simplest answer is the best answer. It says that the simplest answer consistent with observation so far is the best answer. When we are short on resources (and we always are, because AIXI is not computable), then we may choose a different explanation than the simplest one. However this does not make the alternative correct. -- Matt Mahoney, [EMAIL PROTECTED] --- On Tue, 10/28/08, Pei Wang [EMAIL PROTECTED] wrote: From: Pei Wang [EMAIL PROTECTED] Subject: [agi] Occam's Razor and its abuse To: agi@v2.listbox.com Date: Tuesday, October 28, 2008, 11:58 AM Triggered by several recent discussions, I'd like to make the following position statement, though won't commit myself to long debate on it. ;-) Occam's Razor, in its original form, goes like "entities must not be multiplied beyond necessity", and it is often stated as "All other things being equal, the simplest solution is the best" or "when multiple competing theories are equal in other respects, the principle recommends selecting the theory that introduces the fewest assumptions and postulates the fewest entities" --- all from http://en.wikipedia.org/wiki/Occam%27s_razor I fully agree with all of the above statements. However, to me, there are two common misunderstandings associated with it in the context of AGI and philosophy of science. (1) To take this statement as self-evident or a stand-alone postulate. To me, it is derived or implied by the insufficiency of resources. If a system has sufficient resources, it has no good reason to prefer a simpler theory. (2) To take it to mean "The simplest answer is usually the correct answer." This is a very different statement, which cannot be justified either analytically or empirically. When theory A is an approximation of theory B, usually the former is simpler than the latter, but less correct or accurate, in terms of its relation with all available evidence. When we are short on resources and have a low demand on accuracy, we often prefer A over B, but it does not mean that by doing so we judge A as more correct than B. In summary, in choosing among alternative theories or conclusions, the preference for simplicity comes from shortage of resources, though simplicity and correctness are logically independent of each other.
Pei -- Ben Goertzel, PhD CEO, Novamente LLC and Biomind LLC Director of Research, SIAI [EMAIL PROTECTED] A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects. -- Robert Heinlein
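As a concrete illustration of the decision rule Matt is describing, here is a heavily simplified, hypothetical sketch of an AIXI-flavored agent: it weights every environment in a tiny hand-made class by 2^-complexity, discards environments contradicted by history, and picks the action with the highest mixture-expected reward. Real AIXI ranges over all computable environments and is incomputable; nothing below is from Hutter's paper.

# Toy AIXI-flavored decision rule: a Bayesian mixture over a tiny, hand-made
# environment class, each environment weighted 2**-complexity; the agent picks
# the action with the highest mixture-expected reward.

envs = [
    # (complexity in "bits", reward function: action -> reward)
    (2, lambda a: 1.0 if a == "left" else 0.0),
    (3, lambda a: 1.0 if a == "right" else 0.0),
    (5, lambda a: 0.5),  # an environment indifferent to the action
]

def best_action(actions, history):
    # Keep only environments consistent with the (action, reward) history.
    def consistent(reward_fn):
        return all(abs(reward_fn(a) - obs) < 1e-9 for a, obs in history)
    live = [(2.0 ** -k, r) for k, r in envs if consistent(r)]
    z = sum(w for w, _ in live)
    value = {a: sum(w / z * r(a) for w, r in live) for a in actions}
    return max(value, key=value.get)

print(best_action(["left", "right"], []))               # -> left (simplest env dominates)
print(best_action(["left", "right"], [("left", 0.0)]))  # -> right (evidence overrules the prior)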
Re: [agi] Occam's Razor and its abuse
Au contraire, I suspect that the fact that biological organisms grow via the same sorts of processes as the biological environment in which they live, causes the organisms' minds to be built with **a lot** of implicit bias that is useful for surviving in the environment... Some have argued that this kind of bias is **all you need** for evolution... see "Evolution Without Selection" by A. Lima de Faria. I think that is wrong, but it's interesting that there's enough evidence to even try to make the argument... ben g On Tue, Oct 28, 2008 at 2:37 PM, Ed Porter [EMAIL PROTECTED] wrote: It appears to me that the assumptions about initial priors used by a self learning AGI or an evolutionary line of AGI's could be quite minimal. My understanding is that once a probability distribution starts receiving random samples from its distribution, the effect of the original prior becomes rapidly lost, unless it is a rather rare one. Such rare, problematic priors would get selected against quickly by evolution. Evolution would tend to tune for the most appropriate priors for the success of subsequent generations (either for continued computation in the same system, if it is capable of enough change, or in descendant systems). Probably the best priors would generally be ones that could be trained moderately rapidly by data. So it seems an evolutionary system or line could initially learn priors without any assumptions for priors other than a random picking of priors. Over time and multiple generations it might develop hereditary priors, and perhaps even different hereditary priors for parts of its network connected to different inputs, outputs or internal controls. The use of priors in an AGI could be greatly improved by having a gen/comp hierarchy in which models for a given concept could be inherited from the priors of sets of models for similar concepts, and the set of appropriate priors could change contextually. It would also seem that the notion of a prior could be improved by blending information from episodic and probabilistic models. It would appear that in almost any generally intelligent system, being able to approximate reality in a manner sufficient for evolutionary success with the most efficient representations would be a characteristic that would be greatly preferred by evolution, because it would allow systems to better model more of their environment sufficiently well for evolutionary success with whatever current modeling capacity they have. So, although a completely accurate description of virtually anything may not find much use for Occam's Razor, as a practically useful representation it often will. It seems to me that Occam's Razor is more oriented to deriving meaningful generalizations than it is to exact descriptions of anything. Furthermore, it would seem to me that a simpler set of preconditions is generally more probable than a more complex one, because it requires less coincidence. It would seem to me this would be true under most random sets of priors for the probabilities of the possible sets of components involved and Occam's Razor type selection. These are the musings of an untrained mind, since I have not spent much time studying philosophy, because such a high percent of it was so obviously stupid (such as what was commonly said when I was young, that "you can't have intelligence without language") and my understanding of math is much less than that of many on this list. But nonetheless I think much of what I have said above is true.
I think its gist is not totally dissimilar to what Abram has said. Ed Porter -Original Message- From: Pei Wang [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 28, 2008 3:05 PM To: agi@v2.listbox.com Subject: Re: [agi] Occam's Razor and its abuse Abram, I agree with your basic idea in the following, though I usually put it in different form. Pei [snip]
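Ed's claim that the influence of the initial prior washes out as samples accumulate is easy to check in the simplest Bayesian setting. A minimal sketch, my own toy example rather than anything from the systems discussed here: two sharply different Beta priors over a coin's bias converge to nearly the same posterior mean after a few hundred flips.

# Prior washout: Beta(a, b) prior + Bernoulli data -> Beta(a + heads, b + tails).
# Two sharply different priors end up with almost identical posterior means.

import random

random.seed(0)
true_p = 0.7
flips = [1 if random.random() < true_p else 0 for _ in range(500)]

priors = {"optimist": (9.0, 1.0), "pessimist": (1.0, 9.0)}
for n in (0, 10, 100, 500):
    heads = sum(flips[:n])
    means = {name: (a + heads) / (a + b + n) for name, (a, b) in priors.items()}
    print(n, {k: round(v, 3) for k, v in means.items()})
# By n = 500 both posterior means sit near the true bias 0.7,
# regardless of which prior the system started with.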
Re: [agi] Occam's Razor and its abuse
Matt, The currently known laws of physics are a *description* of the universe at a certain level, which is fundamentally different from the universe itself. Also, "All human knowledge can be reduced to physics" is not a viewpoint accepted by everyone. Furthermore, "computable" is a property of a mathematical function. It takes a bunch of assumptions to be applied to a statement, and some additional ones to be applied to an object --- Is the Earth computable? Does the previous question ever make sense? Whenever someone proves something outside mathematics, it is always based on certain assumptions. If the assumptions are not well justified, there is no strong reason for people to accept the conclusion, even though the proof process is correct. Pei On Tue, Oct 28, 2008 at 5:23 PM, Matt Mahoney [EMAIL PROTECTED] wrote: Hutter proved Occam's Razor (AIXI) for the case of any environment with a computable probability distribution. It applies to us because the observable universe is Turing computable according to currently known laws of physics. Specifically, the observable universe has a finite description length (approximately 2.91 x 10^122 bits, the Bekenstein bound of the Hubble radius). [snip]
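As an aside, Matt's ~10^122-bit figure can be sanity-checked as an order of magnitude. A back-of-the-envelope sketch, assuming the holographic bound (bits roughly equal to horizon area / (4 * Planck area * ln 2)) and round values for the Hubble constant and Planck length:

# Order-of-magnitude check of the ~10^122-bit bound on the observable universe.
# Holographic bound: information <= horizon area / (4 * Planck area * ln 2).

import math

c   = 2.998e8        # speed of light, m/s
H0  = 2.3e-18        # Hubble constant, ~71 km/s/Mpc expressed in 1/s
l_p = 1.616e-35      # Planck length, m

R = c / H0                         # Hubble radius, ~1.3e26 m
A = 4 * math.pi * R**2             # horizon area, m^2
bits = A / (4 * l_p**2 * math.log(2))
print(f"{bits:.2e}")               # ~3e122, same ballpark as Matt's 2.91e122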
Re: [agi] Occam's Razor and its abuse
--- On Tue, 10/28/08, Ben Goertzel [EMAIL PROTECTED] wrote: What Hutter proved is (very roughly) that given massive computational resources, following Occam's Razor will be -- within some possibly quite large constant -- the best way to achieve goals in a computable environment... That's not exactly proving Occam's Razor, though it is a proof related to Occam's Razor... No, that's AIXI^tl. I was talking about AIXI. Hutter proved both. One could easily argue it is totally irrelevant to AI due to its assumption of massive computational resources. If you mean AIXI^tl, I agree. However, it is AIXI that proves Occam's Razor. AIXI is useful to AGI exactly because it proves noncomputability. We can stop looking for a neat solution. -- Matt Mahoney, [EMAIL PROTECTED]
[agi] Occam's Razor and its abuse
Pei Triggered by several recent discussions, I'd like to make the Pei following position statement, though won't commit myself to long Pei debate on it. ;-) Pei Occam's Razor, in its original form, goes like "entities must not Pei be multiplied beyond necessity", and it is often stated as "All Pei other things being equal, the simplest solution is the best" or Pei "when multiple competing theories are equal in other respects, Pei the principle recommends selecting the theory that introduces the Pei fewest assumptions and postulates the fewest entities" --- all Pei from http://en.wikipedia.org/wiki/Occam%27s_razor Pei I fully agree with all of the above statements. Pei However, to me, there are two common misunderstandings associated Pei with it in the context of AGI and philosophy of science. Pei (1) To take this statement as self-evident or a stand-alone Pei postulate Pei To me, it is derived or implied by the insufficiency of Pei resources. If a system has sufficient resources, it has no good Pei reason to prefer a simpler theory. With all due respect, this is mistaken. Occam's Razor, in some form, is the heart of Generalization, which is the essence (and the G) of GI. For example, if you study concept learning from examples, say in the PAC learning context (related theorems hold in some other contexts as well), there are theorems to the effect that if you find a hypothesis from a simple enough class of hypotheses, it will with very high probability accurately classify new examples chosen from the same distribution, and conversely theorems that state (roughly speaking) that any method that chooses a hypothesis from too expressive a class of hypotheses will have a probability, bounded below by some reasonable number like 1/7, of having large error in its predictions on new examples -- in other words, it is impossible to PAC learn without respecting Occam's Razor. For discussion of the above paragraphs, I'd refer you to Chapter 4 of What is Thought? (MIT Press, 2004). In other words, if you are building some system that learns about the world, it had better respect Occam's razor if you want whatever it learns to apply to new experience. (I use the term Occam's razor loosely; using hypotheses that are highly constrained in ways other than just being concise may work, but you'd better respect simplicity broadly defined. See Chap 6 of WIT? for more discussion of this point.) The core problem of GI is generalization: you want to be able to figure out new problems as they come along that you haven't seen before. In order to do that, you basically must implicitly or explicitly employ some version of Occam's Razor, independent of how much resources you have. In my view, the first and most important question to ask about any proposal for AGI is, in what way is it going to produce Occam hypotheses. If you can't answer that, don't bother implementing a huge system in hopes of capturing your many insights, because the bigger your implementation gets, the less likely it is to get where you want in the end.
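Eric's point can be made quantitative with the standard Occam bound for a finite hypothesis class (a textbook PAC result, not something specific to What is Thought?): a learner that finds a hypothesis consistent with m >= (1/eps)(ln|H| + ln(1/delta)) examples has true error at most eps with probability at least 1 - delta. The sketch below shows why restricting to short descriptions keeps the sample cost manageable:

# Occam/PAC sample-complexity bound for a finite hypothesis class H:
# with m >= (1/eps) * (ln|H| + ln(1/delta)) examples, any hypothesis in H
# consistent with the data has true error <= eps with probability >= 1 - delta.

import math

def pac_sample_size(hypothesis_count, eps=0.05, delta=0.01):
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / eps)

# Treat hypotheses as bit-programs: all programs up to n bits number ~2^(n+1).
for n_bits in (10, 20, 40, 80):
    m = pac_sample_size(2 ** (n_bits + 1))
    print(f"programs up to {n_bits:>2} bits: ~{m} examples suffice")
# Doubling the allowed description length only adds linearly to the sample
# cost, but an unrestricted (infinite) class gives no such guarantee at all,
# which is Eric's point: generalization needs a constrained, "simple" class.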
RE: [agi] Occam's Razor and its abuse
===Below Ben wrote=== I suspect that the fact that biological organisms grow via the same sorts of processes as the biological environment in which they live, causes the organisms' minds to be built with **a lot** of implicit bias that is useful for surviving in the environment... ===My Response=== Au Similaire. That was one of the points I was trying to make! And that arguably supports at least part of what Pei was arguing. I am not arguing it is all you need. You at least need some mechanism for exploring at least some subspace of possible priors, but you don't need any specific pre-selected set of priors. Ed Porter -Original Message- From: Ben Goertzel [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 28, 2008 5:50 PM To: agi@v2.listbox.com Subject: Re: [agi] Occam's Razor and its abuse [snip]
Re: [agi] Occam's Razor and its abuse
Eric: The core problem of GI is generalization: you want to be able to figure out new problems as they come along that you haven't seen before. In order to do that, you basically must implicitly or explicitly employ some version of Occam's Razor. It all depends on the subject matter of the generalization. It's a fairly good principle, but there is such a thing as simple-mindedness. For example, what is the cluster of associations evoked in the human brain by any given idea, and what is the principle [or principles] that determines how many associations in how many domains and how many brain areas? The answers to these questions are unlikely to be simple. IOW, if the subject matter is complex, the generalization may also have to be complex.
Re: [agi] Occam's Razor and its abuse
Ed, Since NARS doesn't follow the Bayesian approach, there are no "initial priors" to be assumed. If we use a more general term, such as "initial knowledge" or "innate beliefs", then yes, you can add them into the system, which will improve the system's performance. However, they are optional. In NARS, all object-level (i.e., not meta-level) innate beliefs can be learned by the system afterward. Pei On Tue, Oct 28, 2008 at 5:37 PM, Ed Porter [EMAIL PROTECTED] wrote: It appears to me that the assumptions about initial priors used by a self learning AGI or an evolutionary line of AGI's could be quite minimal. [snip]
Re: [agi] Occam's Razor and its abuse
Eric, I highly respect your work, though we clearly have different opinions on what intelligence is, as well as on how to achieve it. For example, though learning and generalization play central roles in my theory about intelligence, I don't think PAC learning (or the other learning algorithms proposed so far) provides a proper conceptual framework for the typical situation of this process. Generally speaking, I'm not building some system that learns about the world, in the sense that there is a correct way to describe the world waiting to be discovered, which can be captured by some algorithm. Instead, learning to me is a non-algorithmic open-ended process by which the system summarizes its own experience, and uses it to predict the future. I fully understand that most people in this field probably consider this opinion wrong, though I haven't been convinced yet by the arguments I've seen so far. Instead of addressing all of the relevant issues, in this discussion I have a very limited goal. To rephrase what I said initially, I see that under the term "Occam's Razor", currently there are three different statements: (1) Simplicity (in conclusions, hypotheses, theories, etc.) is preferred. (2) The preference for simplicity does not need a reason or justification. (3) Simplicity is preferred because it is correlated with correctness. I agree with (1), but not (2) and (3). I know many people have different opinions, and I don't attempt to argue with them here --- these problems are too complicated to be settled by email exchanges. However, I do hope to convince people in this discussion that the three statements are not logically equivalent, and (2) and (3) are not implied by (1), so to use "Occam's Razor" to refer to all of them is not a good idea, because it is going to mix different issues. Therefore, I suggest that people use "Occam's Razor" in its original and basic sense, that is, (1), and use other terms to refer to (2) and (3). Otherwise, when people talk about Occam's Razor, I just don't know what to say. Pei On Tue, Oct 28, 2008 at 8:09 PM, Eric Baum [EMAIL PROTECTED] wrote: [snip]
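Pei's separation of statements (1) and (3) can be illustrated with a toy model-selection example (my own construction, not from this thread): when the data come from a more complex source, the simpler model stays cheaper to store and evaluate, yet it is plainly the less correct one, so preferring it is a resource decision rather than a correctness judgment.

# Simplicity vs. correctness: fit a cheap linear model and a richer cubic model
# to data generated by a cubic law. The linear model is simpler (2 parameters,
# cheaper to use) but systematically less accurate -- Pei's (1) vs (3).

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 200)
y = 0.5 * x**3 - x + rng.normal(0, 0.1, x.size)   # a "theory B"-like world

for degree in (1, 3):
    coeffs = np.polyfit(x, y, degree)
    rmse = np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2))
    print(f"degree {degree}: {degree + 1} parameters, RMSE = {rmse:.3f}")
# degree 1 is the "theory A" approximation: fewer parameters, worse fit.
# Choosing it can be rational under resource pressure, but that choice
# does not make it the more *correct* description of the data.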
Re: [agi] Occam's Razor and its abuse
If not verify, what about falsify? To me, Occam's Razor has always been a tool for selecting the first hypothesis to attempt to falsify. If you can't, or haven't, falsified it, then it's usually the best assumption to go on (presuming that the costs of failing are evenly distributed). OTOH, Occam's Razor clearly isn't quantitative, and it doesn't always pick the right answer, just one that's good enough based on what we know at the moment. (Again presuming evenly distributed costs of failure.) (And actually that's an oversimplification. I've been assuming that the cost of presuming the theory chosen by Occam's Razor is equal to or lower than the costs of the alternatives. Whoops! The simplest workable approach isn't always the cheapest, and given that all not-yet-falsified approaches have closely equal plausibility ... perhaps one should instead choose the cheapest to presume of all theories that have been vetted against current knowledge.) Occam's Razor is fine for its original purposes, but when you try to apply it to practical rather than logical problems, you start needing to evaluate relative costs: both costs of presuming and costs of failure. And actually it often turns out that a solution based on a theory known to be incorrect (e.g., Newton's laws) is good enough, so you don't need to decide about the correct answer. NASA uses Newton, not Einstein, even though Einstein might be correct and Newton is known to be wrong. Pei Wang wrote: Ben, It seems that you agree the issue I pointed out really exists, but just take it as a necessary evil. Furthermore, you think I also assumed the same thing, though I failed to see it. I won't argue against the "necessary evil" part --- as far as you agree that those postulates (such as "the universe is computable") are not convincingly justified. I won't try to disprove them. As for the latter part, I don't think you can convince me that you know me better than I know myself. ;-) The following is from http://nars.wang.googlepages.com/wang.semantics.pdf , page 28: "If the answers provided by NARS are fallible, in what sense these answers are better than arbitrary guesses? This leads us to the concept of rationality. When infallible predictions cannot be obtained (due to insufficient knowledge and resources), answers based on past experience are better than arbitrary guesses, if the environment is relatively stable. To say an answer is only a summary of past experience (thus no future confirmation guaranteed) does not make it equal to an arbitrary conclusion — it is what adaptation means. Adaptation is the process in which a system changes its behaviors as if the future is similar to the past. It is a rational process, even though individual conclusions it produces are often wrong. For this reason, valid inference rules (deduction, induction, abduction, and so on) are the ones whose conclusions correctly (according to the semantics) summarize the evidence in the premises. They are truth-preserving in this sense, not in the model-theoretic sense that they always generate conclusions which are immune from future revision." --- so you see, I don't assume adaptation will always be successful, even successful to a certain probability. You can dislike this conclusion, though you cannot say it is the same as what is assumed by Novamente and AIXI. Pei On Tue, Oct 28, 2008 at 2:12 PM, Ben Goertzel [EMAIL PROTECTED] wrote: [snip]
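The cost-based selection rule suggested above is easy to state as code. A hypothetical toy with invented numbers, purely to pin down the idea: among theories not yet falsified by current knowledge, pick the one minimizing cost-of-use plus expected cost of failure, which is how Newton can beat Einstein for engineering purposes despite being known to be wrong.

# Choosing among unfalsified theories by cost, not truth: score each theory
# by (cost of using it) + (chance it misleads in this domain) * (cost of
# being misled). All numbers are invented purely for illustration.

theories = {
    # name: (cost_of_use, p_failure_in_this_domain, cost_of_failure)
    "Newtonian mechanics": (1.0, 1e-6, 1e4),   # cheap; fine at low speeds
    "General relativity":  (50.0, 1e-9, 1e4),  # more accurate, costly to apply
}

def presumption_cost(use, p_fail, fail):
    return use + p_fail * fail

best = min(theories, key=lambda t: presumption_cost(*theories[t]))
print(best)  # "Newtonian mechanics": known-wrong, but cheapest to presume here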