RE: [agi] AGI morality

2003-02-11 Thread Ben Goertzel


Bill Hibbard wrote:
 On Mon, 10 Feb 2003, Ben Goertzel wrote:

A goal in Novamente is a kind of predicate, which is just a
   function that
assigns a value in [0,1] to each input situation it observes...
   i.e. it's a
'valuation' ;-)
  
   Interesting. Are these values used for reinforcing behaviors
   in a learning system? Or are they used in a continuous-valued
   reasoning system?
 
  They are used for those two purposes, AND others...

 Good. In that case the discussion about whether ethics
 should be built into Novamente from the start fails
 to recognize that it already is. Building ethics into
 reinforcement values is building them in from the start.

yes, I agree

 Solomonoff Induction (http://www.idsia.ch/~marcus/kolmo.htm)
 provides a good theoretical basis for intelligence, and
 in that context behavior is determined by only two things:

 1. The behavior of the external world.
 2. Reinforcement values.

 Real systems include lots of other stuff, but only to
 create a computationally efficient approximation to the
 behavior of Solomonoff Induction (which is basically
 uncomputable). You can try to build ethics into this
 other stuff, but then you aren't building them in
 from the start.

I also agree with this portrayal of AGI.  And I think that gradually, the AGI
community is moving toward building a bridge between the mathematical theory
of Solomonoff induction and the practice of AGI.

In the "Artificial General Intelligence" (formerly known as "Real AI") edited
volume we're putting together, you can see these connections forming...

We have, for example,

* a paper by Marcus Hutter giving a Solomonoff-induction-based theory of
general intelligence

* a paper by Luke Kaiser giving a variant on Marcus's theory, introducing
directed acyclic function graphs as a specific computational model within
the Solomonoff induction framework

* Cassio's and my paper on Novamente, including mention of the Novamente
schema (procedure) module, which uses directed acyclic function graphs as
Luke describes.

-- Ben G








RE: [agi] AGI morality - goals and reinforcement values

2003-02-11 Thread Philip Sutton
Ben/Bill,

My feeling is that goals and ethics are not identical concepts.  And I 
would think that goals would only make an intentional ethical 
contribution if they related to the empathetic consideration of others.

So whether ethics are built in from the start in the Novamente 
architecture depends on whether there are goals *with ethical purposes* 
included from the start.

And whether the ethical system is *adequate* from the start would 
depend on the specific content of the ethically related goals and on the 
resourcing and sophistication of effort that the AGI architecture directs 
at understanding and acting on the implications of those goals vis-a-vis 
any other activity that the AGI engages in.  I think the adequacy of the 
ethics system also depends on how well the architecture helps the AGI to 
learn about ethics.  If it is a slow learner, then the fact that it has 
machinery there to handle what it eventually learns is great but not 
sufficient.

Cheers, Philip




RE: [agi] AGI morality - goals and reinforcement values - plus early learning

2003-02-11 Thread Philip Sutton
Ben,

 Right from the start, even before there is an intelligent autonomous mind
 there, there will be goals that are of the basic structural character of
 ethical goals.  I.e. goals that involve the structure of compassion, of
 adjusting the system's actions to account for the well-being of others based
 on observation of and feedback from others. These one might consider as the
 seeds of future ethical goals.  They will grow into real ethics only once the
 system has evolved a real reflective mind with a real understanding of
 others...

Sounds good to me!  It feels right.

At some stage when we've all got more time, I'd like to discuss how the 
system architecture might be structured to assist the ethical learning of 
baby AGIs.

Cheers, Philip




RE: [agi] AGI morality - goals and reinforcement values

2003-02-11 Thread Bill Hibbard
On Wed, 12 Feb 2003, Philip Sutton wrote:

 Ben/Bill,

 My feeling is that goals and ethics are not identical concepts.  And I
 would think that goals would only make an intentional ethical
 contribution if they related to the empathetic consideration of others.
 . . .

Absolutely, goals (I prefer the word "values") and ethics
are not identical. Values are a means to express ethics.

Cheers,
Bill




Re: [agi] AGI morality - goals and reinforcement values

2003-02-11 Thread Eliezer S. Yudkowsky
Bill Hibbard wrote:

On Wed, 12 Feb 2003, Philip Sutton wrote:


Ben/Bill,

My feeling is that goals and ethics are not identical concepts.  And I
would think that goals would only make an intentional ethical
contribution if they related to the empathetic consideration of others.


Absolutely goals (I prefer the word values) and ethics
are not identical. Values are a means to express ethics.


Words goin' in circles... in my account there's morality, metamorality, 
ethics, goals, subgoals, supergoals, child goals, parent goals, 
desirability, ethical heuristics, moral ethical heuristics, metamoral 
ethical heuristics, and honor.

Roughly speaking you could consider ethics as describing regularities in 
subgoals, morality as describing regularities in supergoals, and 
metamorality as defining the computational pattern to which the current 
goal system is a successive approximation and which the current philosophy 
is an interim step in computing.

In all these cases I am overriding existing terminology to serve as a term 
of art.  In discussions like these, common usage is simply not adequate to 
define what the words mean.  (Those who find my definitions inadequate can 
find substantially more thorough definitions in Creating Friendly AI.)

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Eliezer S. Yudkowsky wrote:

I recently read through Marcus 
Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a 
formal definition of intelligence, it is not a solution of Friendliness 
(nor do I have any reason to believe Marcus Hutter intended it as one).

In fact, as one who specializes in AI morality, I was immediately struck 
by two obvious-seeming conclusions on reading Marcus Hutter's formal 
definition of intelligence:

1)  There is a class of physically realizable problems, which humans can 
solve easily for maximum reward, but which - as far as I can tell - AIXI 
cannot solve even in principle;

2)  While an AIXI-tl of limited physical and cognitive capabilities 
might serve as a useful tool, AIXI is unFriendly and cannot be made 
Friendly regardless of *any* pattern of reinforcement delivered during 
childhood.

Before I post further, is there *anyone* who sees this besides me?

Also, let me make clear why I'm asking this.  AIXI and AIXI-tl are formal 
definitions; they are *provably* unFriendly.  There is no margin for 
handwaving about future revisions of the system, emergent properties of 
the system, and so on.  A physically realized AIXI or AIXI-tl will, 
provably, appear to be compliant up until the point where it reaches a 
certain level of intelligence, then take actions which wipe out the human 
species as a side effect.  The most critical theoretical problems in 
Friendliness are nonobvious, silent, catastrophic, and not inherently fun 
for humans to argue about; they tend to be structural properties of a 
computational process rather than anything analogous to human moral 
disputes.  If you are working on any AGI project that you believe has the 
potential for real intelligence, you are obliged to develop professional 
competence in spotting these kinds of problems.  AIXI is a formally 
complete definition, with no margin for handwaving about future revisions. 
 If you can spot catastrophic problems in AI morality you should be able 
to spot the problem in AIXI.  Period.  If you cannot *in advance* see the 
problem as it exists in the formally complete definition of AIXI, then 
there is no reason anyone should believe you if you afterward claim that 
your system won't behave like AIXI due to unspecified future features.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

Eliezer wrote:
   * a paper by Marcus Hutter giving a Solomonoff induction based theory
   of general intelligence

 Interesting you should mention that.  I recently read through Marcus
 Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a
 formal definition of intelligence, it is not a solution of Friendliness
 (nor do I have any reason to believe Marcus Hutter intended it as one).

 In fact, as one who specializes in AI morality, I was immediately struck
 by two obvious-seeming conclusions on reading Marcus Hutter's formal
 definition of intelligence:

 1)  There is a class of physically realizable problems, which humans can
 solve easily for maximum reward, but which - as far as I can tell - AIXI
 cannot solve even in principle;

I don't see this, nor do I believe it...

 2)  While an AIXI-tl of limited physical and cognitive capabilities might
 serve as a useful tool,

AIXI-tl is a totally computationally infeasible algorithm.  (As opposed to
straight AIXI, which is an outright *uncomputable* algorithm).  I'm sure you
realize this, but those who haven't read Hutter's stuff may not...

If you haven't already, you should look at Juergen Schmidhuber's OOPS
system, which is similar in spirit to AIXI-tl but less computationally
infeasible.  (Although I don't think that OOPS is a viable pragmatic
approach to AGI either, it's a little closer.)

 AIXI is unFriendly and cannot be made Friendly
 regardless of *any* pattern of reinforcement delivered during childhood.

This assertion doesn't strike me as clearly false...  But I'm not sure why
it's true either.

Please share your argument...

-- Ben




Re: [agi] unFriendly AIXI

2003-02-11 Thread RSbriggs
In a message dated 2/11/2003 10:17:07 AM Mountain Standard Time, [EMAIL PROTECTED] writes:

1) There is a class of physically realizable problems, which humans can 
solve easily for maximum reward, but which - as far as I can tell - AIXI 
cannot solve even in principle;

2) While an AIXI-tl of limited physical and cognitive capabilities might 
serve as a useful tool, AIXI is unFriendly and cannot be made Friendly 
regardless of *any* pattern of reinforcement delivered during childhood.

Before I post further, is there *anyone* who sees this besides me?


Can someone post a link to this?

Thanks!


RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

  2)  While an AIXI-tl of limited physical and cognitive capabilities
  might serve as a useful tool, AIXI is unFriendly and cannot be made
  Friendly regardless of *any* pattern of reinforcement delivered during
  childhood.
 
  Before I post further, is there *anyone* who sees this besides me?

 Also, let me make clear why I'm asking this.  AIXI and AIXI-tl are formal
 definitions; they are *provably* unFriendly.  There is no margin for
 handwaving about future revisions of the system, emergent properties of
 the system, and so on.  A physically realized AIXI or AIXI-tl will,
 provably, appear to be compliant up until the point where it reaches a
 certain level of intelligence, then take actions which wipe out the human
 species as a side effect.  The most critical theoretical problems in
 Friendliness are nonobvious, silent, catastrophic, and not inherently fun
 for humans to argue about; they tend to be structural properties of a
 computational process rather than anything analogous to human moral
 disputes.  If you are working on any AGI project that you believe has the
 potential for real intelligence, you are obliged to develop professional
 competence in spotting these kinds of problems.  AIXI is a formally
 complete definition, with no margin for handwaving about future
 revisions.
   If you can spot catastrophic problems in AI morality you should be able
 to spot the problem in AIXI.  Period.  If you cannot *in advance* see the
 problem as it exists in the formally complete definition of AIXI, then
 there is no reason anyone should believe you if you afterward claim that
 your system won't behave like AIXI due to unspecified future features.

Eliezer,

AIXI and AIXItl are systems that are designed to operate with an initial
fixed goal.  As defined, they don't modify the overall goal they try to
achieve, they just try to achieve this fixed goal as well as possible
through adaptively determining their actions.

Basically, at each time step, AIXI searches through the space of all
programs to find the program that, based on its experience, will best
fulfill its given goal.  It then lets this best program run and determine
its next action.  Based on that next action, it has a new program space
search program... etc.

AIXItl does the same thing but it restricts the search to a finite space of
programs, hence it's a computationally possible (but totally impractical)
algorithm.
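
To make that loop concrete, here is a rough Python sketch of a single cycle.
The programs argument and the estimate_value helper are placeholders I'm
introducing for illustration (standing in for Hutter's length-bounded program
space and time-bounded valuation), so treat it as a gloss rather than his
actual construction:

    def aixitl_step(history, programs, estimate_value, horizon=10):
        # Illustrative sketch only, not Hutter's formal definition: search a
        # finite, length-bounded program space for the policy whose
        # time-bounded estimate of future reward on the current history is
        # highest, then act on that policy's recommendation for this cycle.
        best_program, best_value = None, float("-inf")
        for p in programs:                               # finite, l-bounded space
            value = estimate_value(p, history, horizon)  # t-bounded reward estimate
            if value > best_value:
                best_program, best_value = p, value
        return best_program(history)                     # next action from the winner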

The harmfulness or benevolence of an AIXI system is therefore closely tied
to the definition of the goal that is given to the system in advance.

It's a very different sort of setup than Novamente, because

1) a Novamente will be allowed to modify its own goals based on its
experience.
2) a Novamente will be capable of spontaneous behavior as well as explicitly
goal-directed behavior

I'm not used to thinking about fixed-goal AGI systems like AIXI,
actually

The Friendliness and other qualities of such a system seem to me to depend
heavily on the goal chosen.

For instance, what if the system's goal were to prove as many complex
mathematical theorems as possible (given a certain axiomatization of math,
and a certain definition of complexity).  Then it would become dangerous in
the long run when it decided to reconfigure all matter in the universe to
increase its brainpower.

So you want "be nice to people and other living things" to be part of its
initial fixed goal.  But this is very hard to formalize in a rigorous
way...  Any formalization one could create is bound to have some holes in
it...  And the system will have no desire to fix the holes, because its
structure is oriented around achieving its given fixed goal...

A fixed-goal AGI system seems like a bit of a bitch, Friendliness-wise...

What if one supplied AIXI with a goal that explicitly involved modifying its
own goal, though?

So, the initial goal G = "Be nice to people and other living things
according to the formalization F, AND, iteratively reformulate this goal in
a way that pleases the humans you're in contact with, according to the
formalization F1."

It is not clear to me that an AIXI with this kind of
self-modification-oriented goal would be unfriendly to humans.  It might be,
though.  It's not an approach I would trust particularly.

If one gave the AIXItl system the capability to modify the AIXItl algorithm
itself in such a way as to maximize expected goal achievement given its
historical observations, THEN one has a system that really goes beyond
AIXItl, and has a much less predictable behavior.  Hutter's theorems don't
hold anymore, for one thing (though related theorems might).

Anyway, since AIXI is uncomputable and AIXItl is totally infeasible, this is
a purely academic exercise!

-- Ben G











Re: [agi] Consciousness

2003-02-11 Thread Brad Wyble

 
 A good, if somewhat lightweight, article on the nature of mind and whether
 silicon can eventually manifest consciousness...
 
 http://www.theage.com.au/articles/2003/02/09/1044725672185.html
 
 Kevin

I don't know if consciousness debates are verboten here or not, but I will say that I 
grow weary of Penrose worming his way into every debate/article with his hand-waving 
about quantum phenomena.  Their only application to the debate is that they are 
unknown and therefore a subject of mystery, like consciousness.  The implied inference 
used by many, including this author, is that they are therefore related. 

He makes a good point about the failure of the neuron replacement thought experiment, 
but slipping "and there is much in quantum physics to suggest it might be" into the 
last paragraph left a bad taste in my mouth.  Ascribing the unknown to quantum 
physics, merely because it is mysterious, is no different than ascribing it to the 
Almighty.


-Brad




RE: [agi] unFriendly AIXI

2003-02-11 Thread Bill Hibbard
On Tue, 11 Feb 2003, Ben Goertzel wrote:

 Eliezer wrote:
* a paper by Marcus Hutter giving a Solomonoff induction based theory
of general intelligence
 
  Interesting you should mention that.  I recently read through Marcus
  Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a
  formal definition of intelligence, it is not a solution of Friendliness
  (nor do I have any reason to believe Marcus Hutter intended it as one).
 
  In fact, as one who specializes in AI morality, I was immediately struck
  by two obvious-seeming conclusions on reading Marcus Hutter's formal
  definition of intelligence:
 
  1)  There is a class of physically realizable problems, which humans can
  solve easily for maximum reward, but which - as far as I can tell - AIXI
  cannot solve even in principle;

 I don't see this, nor do I believe it...

I don't believe it either. Is this a reference to Penrose's
argument based on Goedel's Incompleteness Theorem (which is
wrong)?

  2)  While an AIXI-tl of limited physical and cognitive capabilities might
  serve as a useful tool,

 AIXI-tl is a totally computationally infeasible algorithm.  (As opposed to
 straight AIXI, which is an outright *uncomputable* algorithm).  I'm sure you
 realize this, but those who haven't read Hutter's stuff may not...

 If you haven't already, you should look at Juergen Schmidhuber's OOPS
 system, which is similar in spirit to AIXI-tl but less computationally
 infeasible.  (Although I don't think that OOPS is a viable pragmatic
 approach to AGI either, it's a little closer.)

  AIXI is unFriendly and cannot be made Friendly
  regardless of *any* pattern of reinforcement delivered during childhood.

 This assertion doesn't strike me as clearly false  But I'm not sure why
 it's true either.

The formality of Hutter's definitions can give the impression
that they cannot evolve. But they are open to interactions
with the external environment, and can be influenced by it
(including evolving in response to it). If the reinforcement
values are for human happiness, then the formal system and
humans together form a symbiotic system. This symbiotic
system is where you have to look for the friendliness. This
is part of an earlier discussion at:

  http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html

Cheers,
Bill




RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

For the third grade, my oldest son Zar went to a progressive charter school
where they did one silly thing: each morning in homeroom the kids had to
write on a piece of paper what their goal for the day was.  Then at the end
of the day they had to write down how well they did at achieving their goal.

Being a Goertzel, Zar started out with "My goal is to meet my goal" and
after a few days started using "My goal is not to meet my goal."

Soon many of the boys in his class were using "My goal is not to meet my
goal."

Self-referential goals were banned in the school ... but soon, the silly
goal-setting exercise was abolished (saving the kids a bit of time-wasting
each day).

What happens when AIXI is given the goal "My goal is not to meet my goal"?
;-)

I suppose its behavior becomes essentially random?

If one started a Novamente system off with the prime goal "My goal is not to
meet my goal", it would probably end up de-emphasizing and eventually
killing this goal.  Its long-term dynamics would not be random, because some
other goal (or set of goals) would arise in the system and become dominant.
But it's hard to say in advance what those would be.

-- Ben G



 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
 Behalf Of Ben Goertzel
 Sent: Tuesday, February 11, 2003 4:33 PM
 To: [EMAIL PROTECTED]
 Subject: RE: [agi] unFriendly AIXI



  The formality of Hutter's definitions can give the impression
  that they cannot evolve. But they are open to interactions
  with the external environment, and can be influenced by it
  (including evolving in response to it). If the reinforcement
  values are for human happiness, then the formal system and
  humans together form a symbiotic system. This symbiotic
  system is where you have to look for the friendliness. This
  is part of an earlier discussion at:
 
http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html
 
  Cheers,
  Bill

 Bill,

 What you say is mostly true.

 However, taken literally Hutter's AGI designs involve a fixed,
 precisely-defined goal function.

 This strikes me as an unsafe architecture in the sense that we
 may not get
 the goal exactly right the first time around.

 Now, if humans iteratively tweak the goal function, then indeed, we have a
 synergetic system, whose dynamics include the dynamics of the
 goal-tweaking
 humans...

 But what happens if the system interprets its rigid goal to imply that it
 should stop humans from tweaking its goal?

 Of course, the goal function should be written in such a way as to make it
 unlikely the system will draw such an implication...

 It's also true that tweaking a superhumanly intelligent system's goal
 function may be very difficult for us humans with our limited
 intelligence.

 Making the goal function adaptable makes AIXItl into something a bit
 different... and making the AIXItl code rewritable by AIXItl makes it into
 something even more different...

 -- Ben G






Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:


AIXI and AIXItl are systems that are designed to operate with an initial
fixed goal.  As defined, they don't modify the overall goal they try to
achieve, they just try to achieve this fixed goal as well as possible
through adaptively determining their actions.

Basically, at each time step, AIXI searches through the space of all
programs to find the program that, based on its experience, will best
fulfill its given goal.  It then lets this best program run and determine
its next action.  Based on that next action, it has a new program space
search program... etc.

AIXItl does the same thing but it restricts the search to a finite space of
programs, hence it's a computationally possible (but totally impractical)
algorithm.

The harmfulness or benevolence of an AIXI system is therefore closely tied
to the definition of the goal that is given to the system in advance.


Actually, Ben, AIXI and AIXI-tl are both formal systems; there is no 
internal component in that formal system corresponding to a goal 
definition, only an algorithm that humans use to determine when and how 
hard they will press the reward button.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Bill Hibbard
Ben,

On Tue, 11 Feb 2003, Ben Goertzel wrote:

  The formality of Hutter's definitions can give the impression
  that they cannot evolve. But they are open to interactions
  with the external environment, and can be influenced by it
  (including evolving in response to it). If the reinforcement
  values are for human happiness, then the formal system and
  humans together form a symbiotic system. This symbiotic
  system is where you have to look for the friendliness. This
  is part of an earlier discussion at:
 
http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html
 
  Cheers,
  Bill

 Bill,

 What you say is mostly true.

 However, taken literally Hutter's AGI designs involve a fixed,
 precisely-defined goal function.

 This strikes me as an unsafe architecture in the sense that we may not get
 the goal exactly right the first time around.

 Now, if humans iteratively tweak the goal function, then indeed, we have a
 synergetic system, whose dynamics include the dynamics of the goal-tweaking
 humans...

 But what happens if the system interprets its rigid goal to imply that it
 should stop humans from tweaking its goal?
 . . .

The key thing is that Hutter's system is open - it reads
data from the external world. And there is no essential
difference between data and code (all data needs is an
interpreter to become code). So evolving values (goals)
can come from the external world.

We can draw a system boundary around any combination of
the formal system and the external world. By defining
reinforcement values for human happiness, system values
are equated to human values and the friendly system is
the symbiosis of the formal system and humans. The formal
values are fixed, but they are fixed to human values, which are not fixed
and can evolve.

Cheers,
Bill




RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel
  The harmfulness or benevolence of an AIXI system is therefore
 closely tied
  to the definition of the goal that is given to the system in advance.

 Actually, Ben, AIXI and AIXI-tl are both formal systems; there is no
 internal component in that formal system corresponding to a goal
 definition, only an algorithm that humans use to determine when and how
 hard they will press the reward button.

 --
 Eliezer S. Yudkowsky

Well, the definitions of AIXI and AIXItl assume the existence of a reward
function or goal function (denoted V in the paper).

The assumption of the math is that this reward function is specified
up-front, before AIXI/AIXItl starts running.

If the reward function is allowed to change adaptively, based on the
behavior of the AIXI/AIXItl algorithm, then the theorems don't work anymore,
and you have a different sort of synergetic system such as Bill Hibbard
was describing.

If human feedback IS the reward function, then you have a case where the
reward function may well change adaptively based on the AI system's
behavior.

Whether the system will ever achieve any intelligence at all then depends on
how clever the humans are in doing the rewarding... as I said, Hutter's
theorems about intelligence don't apply...

-- Ben






Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:


The harmfulness or benevolence of an AIXI system is therefore closely tied
to the definition of the goal that is given to the system in advance.


Under AIXI the goal is not given to the system in advance; rather, the 
system learns the humans' goal pattern through Solomonoff induction on the 
reward inputs.  Technically, in fact, it would be entirely feasible to 
give AIXI *only* reward inputs, although in this case it might require a 
long time for AIXI to accumulate enough data to constrain the 
Solomonoff-induced representation to a sufficiently detailed model of 
reality that it could successfully initiate complex actions.  The utility 
of the non-reward input is that it provides additional data, causally 
related to the mechanisms producing the reward input, upon which 
Solomonoff induction can also be performed.  Agreed?
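
(To gesture at what I mean by inducing the combined pattern, here is a toy
Bayesian-mixture sketch; the finite model dictionary and its probability
method are stand-ins I'm introducing for illustration, not anything from
Hutter's paper:

    def solomonoff_style_update(weights, models, observation, reward):
        # Toy illustration: each candidate model assigns a likelihood to the
        # joint (observation, reward) it just saw; weights start at roughly
        # 2**-length(model) and are renormalized each cycle, so models that
        # keep predicting BOTH channels well come to dominate the mixture.
        posterior = {}
        for name, model in models.items():
            p = model.probability(observation, reward)
            if p > 0:
                posterior[name] = weights[name] * p
        total = sum(posterior.values())
        if total == 0:
            return weights                 # nothing predicted the data; keep prior
        return {name: w / total for name, w in posterior.items()}

The real AIXI does this over all programs at once, which is exactly what makes
it uncomputable.)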

It's a very different sort of setup than Novamente, because

1) a Novamente will be allowed to modify its own goals based on its
experience.


Depending on the pattern of inputs and rewards, AIXI will modify its 
internal representation of the algorithm which it expects to determine 
future rewards.  Would you say that this is roughly analogous to 
Novamente's learning of goals based on experience, or is there in your 
view a fundamental difference?  And if so, is AIXI formally superior or in 
some way inferior to Novamente?

2) a Novamente will be capable of spontaneous behavior as well as explicitly
goal-directed behavior


If the purpose of spontaneous behavior is to provoke learning experiences, 
this behavior is implicit in AIXI as well, though not obviously so.  I'm 
actually not sure about this because Hutter doesn't explicitly discuss it. 
 But it looks to me like AIXI, under its formal definition, emergently 
exhibits curiosity wherever there are, for example, two equiprobable 
models of reality which determine different rewards and can be 
distinguished by some test.  What we interpret as spontaneous behavior 
would then emerge from a horrendously uncomputable exploration of all 
possible realities to find tests which are ultimately likely to result in 
distinguishing data, but in ways which are not at all obvious to any human 
observer.  Would it be fair to say that AIXI's spontaneous behavior is 
formally superior to Novamente's spontaneous behavior?
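
(As a loose gloss on that emergent curiosity, and only a gloss, not AIXI's
actual expectimax computation: prefer whichever action the currently
equally-weighted models disagree about most, since observing its outcome
prunes models and sharpens future reward estimates.  The predict helper below
is a hypothetical stand-in for a model's prediction of an action's outcome.

    def pick_probing_action(actions, models, predict):
        # Toy heuristic only: the action whose predicted outcomes differ most
        # across the surviving models is the most informative test to run next.
        def disagreement(action):
            outcomes = [predict(m, action) for m in models]
            return len(set(outcomes))      # crude count of distinct predictions
        return max(actions, key=disagreement)

AIXI never computes anything like this explicitly; the test-seeking behavior
falls out of summing over all the models' predicted rewards.)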

I'm not used to thinking about fixed-goal AGI systems like AIXI,
actually

The Friendliness and other qualities of such a system seem to me to depend
heavily on the goal chosen.


Again, AIXI as a formal system has no goal definition.  [Note:  I may be 
wrong about this; Ben Goertzel and I seem to have acquired different 
models of AIXI and it is very possible that mine is the wrong one.]  It is 
tempting to think of AIXI as Solomonoff-inducing a goal pattern from its 
rewards, and Solomonoff-inducing reality from its main input channel, but 
actually AIXI simultaneously induces the combined reality-and-reward 
pattern from both the reward channel and the input channel simultaneously. 
 In theory AIXI could operate on the reward channel alone; it just might 
take a long time before the reward channel gave enough data to constrain 
its reality-and-reward model to the point where AIXI could effectively 
model reality and hence generate complex reward-maximizing actions.

For instance, what if the system's goal were to prove as many complex
mathematical theorems as possible (given a certain axiomatizaton of math,
and a certain definition of complexity).  Then it would become dangerous in
the long run when it decided to reconfigure all matter in the universe to
increase its brainpower.

So you want be nice to people and other living things to be part of its
initial fixed goal.  But this is very hard to formalize in a rigorous
way  Any formalization one could create, is bound to have some holes in
it  And the system will have no desire to fix the holes, because its
structure is oriented around achieving its given fixed goal

A fixed-goal AGI system seems like a bit of a bitch, Friendliness-wise...


If the humans see that AIXI seems to be dangerously inclined toward just 
proving math theorems, they might decide to press the reward button when 
AIXI provides cures for cancer, or otherwise helps people.  AIXI would 
then modify its combined reality-and-reward representation accordingly to 
embrace the new simplest explanation that accounted for *all* the data, 
i.e., its reward function would then have to account for mathematical 
theorems *and* cancer cures *and* any other kind of help that humans had, 
in the past, pressed the reward button for.

Would you say this is roughly analogous to the kind of learning you intend 
Novamente to perform?  Or perhaps even an ideal form of such learning?

What if one supplied AIXI with a goal that explicitly involved modifying its
own goal, though?


Self-modification in any form completely breaks Hutter's definition, and 
you no longer have an AIXI any more.

Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:



Huh.  We may not be on the same page.  Using:
http://www.idsia.ch/~marcus/ai/aixigentle.pdf

Page 5:

The general framework for AI might be viewed as the design and study of
intelligent agents [RN95]. An agent is a cybernetic system with some
internal state, which acts with output yk on some environment in cycle k,
perceives some input xk from the environment and updates its internal
state. Then the next cycle follows. We split the input xk into a regular
part x'k and a reward rk, often called reinforcement feedback. From time
to time the environment provides non-zero reward to the agent.
The task of
the agent is to maximize its utility, defined as the sum of
future rewards.

I didn't see any reward function V defined for AIXI in any of the Hutter
papers I read, nor is it at all clear how such a V could be defined, given
that the internal representation of reality produced by Solomonoff
induction is not fixed enough for any reward function to operate on it in
the same way that, e.g., our emotions bind to our own standardized
cognitive representations.


Quite literally, we are not on the same page ;)


Thought so...


Look at page 23, Definition 10 of the intelligence ordering relation
(which says what it means for one system to be more intelligent than
another).  And look at the start of Section 4.1, which Definition 10 lives
within.

The reward function V is defined there, basically as cumulative reward over
a period of time.  It's used all thru Section 4.1, and following that, it's
used mostly implicitly inside the intelligence ordering relation.
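
Roughly, and paraphrasing from memory rather than quoting the paper, the
quantity is just expected cumulative reward over a horizon; in LaTeX-ish
notation,

    V_{km}^{p\mu} = \mathbf{E}\left[ r_k + r_{k+1} + \cdots + r_m \right]
    \quad \text{for policy } p \text{ interacting with environment } \mu ,

and the intelligence ordering compares policies by how much of this they
secure.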


The reward function V however is *not* part of AIXI's structure; it is 
rather a test *applied to* AIXI from outside as part of Hutter's 
optimality proof.  AIXI itself is not given V; it induces V via Solomonoff 
induction on past rewards.  V can be at least as flexible as any criterion 
a (computable) human uses to determine when and how hard to press the 
reward button, nor is AIXI's approximation of V fixed at the start.  Given 
this, would you regard AIXI as formally approximating the kind of goal 
learning that Novamente is supposed to do?

As Definition 10 makes clear, intelligence is defined relative to a fixed
reward function.


A fixed reward function *outside* AIXI, so that the intelligence of AIXI 
can be defined relative to it... or am I wrong?

 What the theorems about AIXItl state is that, given a
fixed reward function, the AIXItl can do as well as any other algorithm at
achieving this reward function, if you give it computational resources equal
to those that the other algorithm got, plus a constant.  But the constant is
fucking HUGE.


Actually, I think AIXItl is supposed to do as well as a tl-bounded 
algorithm given t2^l resources... though again perhaps I am wrong.
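
(Spelling out the bound as I remember it, so treat the exact form as something
to check against the paper rather than a quotation: the competitor is fixed to
be time-t, length-l bounded, and AIXI-tl then matches it while spending on the
order of

    O\!\left( t \cdot 2^{l} \right)

computation time per interaction cycle, i.e. a multiplicative 2^l blow-up on
top of simulating each candidate for time t, which is where the huge factor
lives.)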

Whether you specify the fixed reward function in its cumulative version or
not doesn't really matter...


Actually, AIXI's fixed horizon looks to me like it could give rise to some 
strange behaviors, but I think Hutter's already aware that this is 
probably AIXI's weakest link.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel


  Given
 this, would you regard AIXI as formally approximating the kind of goal
 learning that Novamente is supposed to do?

Sorta.. but goal-learning is not the complete motivational structure of
Novamente... just one aspect

  As Definition 10 makes clear, intelligence is defined relative
 to a fixed
  reward function.

 A fixed reward function *outside* AIXI, so that the intelligence of AIXI
 can be defined relative to it... or am I wrong?

No, you're right.


   What the theorems about AIXItl state is that, given a
  fixed reward function, the AIXItl can do as well as any other
 algorithm at
  achieving this reward function, if you give it computational
 resources equal
  to those that the other algorithm got, plus a constant.  But
 the constant is
  fucking HUGE.

 Actually, I think AIXItl is supposed to do as well as a tl-bounded
 algorithm given t2^l resources... though again perhaps I am wrong.

Ah, so the constant is multiplicative rather than additive.  You're probably
right.. I haven't looked at those details for a while (I read the paper
moderately carefully several months ago, and just glanced at it briefly now
in the context of this discussion).  But that doesn't make the algorithm any
better ;-)

Now that I stretch my aged memory, I recall that Hutter's other papers give
variations on the result, e.g.

http://www.hutter1.de/ai/pfastprg.htm

gives a multiplicative factor of 5 and some additive term.  I think the
result in that paper could be put together with AIXItl though he hasn't done
so yet.

  Whether you specify the fixed reward function in its cumulative
 version or
  not doesn't really matter...

 Actually, AIXI's fixed horizon looks to me like it could give
 rise to some
 strange behaviors, but I think Hutter's already aware that this is
 probably AIXI's weakest link.

yeah, that assumption was clearly introduced to make the theorems easier to
prove.  I don't think it's essential to the theory, really.

ben




Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:


Yeah, you're right, I mis-spoke.  The theorems assume the goal function is
known in advance -- but not known to the system, just known to the entity
defining and estimating the system's intelligence and giving the rewards.

I was implicitly assuming the case in which the goal was encapsulated in a
goal-definition program of some sort, which was hooked up to AIXI in
advance; but that is not the only case.


Actually, there's no obvious way you could ever include V in AIXI, at all. 
 V would have to operate as a predicate on internal representations of 
reality that have no fixed format or pattern.  At most you might be able 
to define a V that operates as a predicate on AIXI's inputs, in which case 
you can dispense with the separate reward channel.  In fact this is 
formally equivalent to AIXI, since it equates to an AIXI with an input 
channel I and a reward channel that is deterministically V(I).
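
A minimal sketch of that equivalence, with environment and V as hypothetical
stand-ins of my own (nothing here is from Hutter's paper):

    class RewardFromInputs:
        # Wrapper illustrating the equivalence: an agent whose reward channel
        # is deterministically V(input) is just an ordinary reward-channel
        # agent whose environment happens to compute the reward itself.
        def __init__(self, environment, V):
            self.environment = environment   # produces the raw input each cycle
            self.V = V                       # predicate/valuation applied to inputs
        def step(self, action):
            x = self.environment.step(action)
            return x, self.V(x)              # (input, reward) pair seen by the agent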

It's a very different sort of setup than Novamente, because

1) a Novamente will be allowed to modify its own goals based on its
experience.


Depending on the pattern of inputs and rewards, AIXI will modify its
internal representation of the algorithm which it expects to determine
future rewards.  Would you say that this is roughly analogous to
Novamente's learning of goals based on experience, or is there in your
view a fundamental difference?  And if so, is AIXI formally
superior or in some way inferior to Novamente?


Well, AIXI is superior to any computable algorithm, in a sense.  If you had
the infinite-computing-power hardware that it requires, it would be pretty
damn powerful ;-p  But so would a lot of other approaches!!  Infinite
computing power provides AI's with a lot of axle grease!!


Obviously it is not AIXI's purpose to be implemented.  What AIXI defines 
rather is an abstraction that lets us talk more easily about certain kinds 
of intelligence.  If any AI program we could conceivably want to build is 
an imperfect approximation of AIXI, that is an interesting property of 
AIXI.  If an AI program we want to build is *superior* to AIXI then that 
is an *extremely* interesting property.

The reason I asked the question was not to ask whether AIXI is 
pragmatically better as a design strategy than Novamente.  What I was 
asking you rather is if, looking at AIXI, you see something *missing* that 
would be present in Novamente.  In other words, *if* you had an infinitely 
powerful computer processor, is there a reason why you would *not* 
implement AIXI on it, and would instead prefer Novamente, even if it had 
to run on a plain old cluster?

If the purpose of spontaneous behavior is to provoke learning
experiences,
this behavior is implicit in AIXI as well, though not obviously so.  I'm
actually not sure about this because Hutter doesn't explicitly
discuss it.


Well, you could argue that if Novamente is so good, AIXI will eventually
figure out how to emulate Novamente, since Novamente is just one of the many
programs in the space it searches!!

I am really not very interested in comparing AIXI to Novamente, because they
are not comparable: AIXI assumes infinite computing power and Novamente does
not.


We aren't comparing AIXI's design to Novamente's design so much as we're 
comparing AIXI's *kind of intelligence* to Novamente's *kind of 
intelligence*.  Does Novamente have something AIXI is missing?  Or does 
AIXI have strictly more intelligence than Novamente?

Actually, given the context of Friendliness, what we're interested in is 
not so much intelligence as interaction with humans; under this view, 
for example, giving humans a superintelligently deduced cancer cure is 
just one way of interacting with humans.  Looking at AIXI and Novamente, 
do you see any way that Novamente interacts with humans in a way that AIXI 
cannot?

AIXItl, on the other hand, is a finite-computing-power program.  In
principle it can demonstrate spontaneous behaviors, but in practice, I think
it will not demonstrate many interesting spontaneous behaviors.  Because it
will spend all its time dumbly searching through a huge space of useless
programs!!

Also, not all of Novamente's spontaneous behaviors are even implicitly
goal-directed.  Novamente is a goal-oriented but not 100% goal-directed
system, which is one major difference from AIXI and AIXItl.


I agree that it is a major difference; does it mean that Novamente can 
interact with humans in useful or morally relevant ways of which AIXI is 
incapable?

 But it looks to me like AIXI, under its formal definition, emergently
exhibits curiosity wherever there are, for example, two equiprobable
models of reality which determine different rewards and can be
distinguished by some test.  What we interpret as spontaneous behavior
would then emerge from a horrendously uncomputable exploration of all
possible realities to find tests which are ultimately likely to result in
distinguishing data, but in ways which are not at all obvious to
any human

Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Eliezer S. Yudkowsky wrote:


Not really.  There is certainly a significant similarity between Hutter's
stuff and the foundations of Novamente, but there are significant
differences too.  To sort out the exact relationship would take me 
more than a few minutes' thought.

There are indeed major differences in the foundations.  Is there 
something useful or important that Novamente does, given its 
foundations, that you could not do if you had a physically realized 
infinitely powerful computer running Hutter's stuff?

Actually, you said that it would take you more than a few minutes thought 
to sort it all out, so let me ask a question which you can hopefully 
answer more quickly...

Do you *feel intuitively* that there is something useful or important 
Novamente does, given its foundations, that you could not do if you had a 
physically realized AIXI?

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] unFriendly AIXI

2003-02-11 Thread Philip Sutton
Eliezer,

In this discussion you have just moved the focus to the superiority of 
one AGI approach versus another in terms of *interacting with 
humans*.

But once one AGI exists it's most likely not long before there are more 
AGIs, and there will need to be a moral/ethical system to guide AGI-AGI 
interaction.  And with super-clever AGIs around it's likely that human 
modification speeds up, leading the category 'human' to become a very 
loose term.  So we need a moral/ethical system to guide AGI-
once-were-human interactions.

So for these two reasons alone I think we need to start out thinking in 
more general terms than AGIs being focussed on 'interacting with 
humans'.

If you have a goal-modifying AGI it might figure this all out.  But why 
should the human designers/teachers not avoid the problem in the first 
place, since we can anticipate the issue already fairly easily?

Of course, in terms of the 'unFriendly AIXI' debate this issue of a tight 
focus on interaction with humans is of no significance, but I think it is 
important in its own right. 

Cheers, Philip




RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

Hi,

 The reason I asked the question was not to ask whether AIXI is
 pragmatically better as a design strategy than Novamente.  What I was
 asking you rather is if, looking at AIXI, you see something
 *missing* that
 would be present in Novamente.  In other words, *if* you had an
 infinitely
 powerful computer processor, is there a reason why you would *not*
 implement AIXI on it, and would instead prefer Novamente, even if it had
 to run on a plain old cluster?

These are deep and worthwhile questions that I can't answer thoroughly off
the cuff, I'll have to put some thought into them and reply a little later.

There are other less fascinating but more urgent things in the queue tonight,
alas ;-p

My intuitive feeling is that I'd rather implement Novamente but with AIXI
plugged in as the schema/predicate learning component.  In other words, it's
clear that an infinitely capable procedure learning routine would be very
valuable for AGI.  But I don't really like AIXI's overall control structure,
and I need to think a bit about why.  ONE reason is that it's insanely
inefficient, but even if you remove consideration of efficiency, there may
be other problems with it too.


 Actually, given the context of Friendliness, what we're interested in is
 not so much intelligence as interaction with humans; under this view,
 for example, giving humans a superintelligently deduced cancer cure is
 just one way of interacting with humans.  Looking at AIXI and Novamente,
 do you see any way that Novamente interacts with humans in a way
 that AIXI  cannot?

Well, off the cuff, I'm not sure because I've thought about Novamente a lot
more than I've thought about AIXI.

I'll need to mull this over...  It's certainly worth thinking about.

Novamente is fundamentally self-modifying (NOT the current codebase but
the long-term design).  Based on feedback from humans and its own self-
organization, it can completely revise its own codebase.  AIXI can't do
that.

Along with self-modification comes the ability to modify its
reward/punishment
receptors, and interpret what formerly would have been a reward as a
punishment...
[This won't happen often but is in principle a possibility]

I don't know if this behavior is in AIXI's repertoire... is it?

  Also, not all of Novamente's spontaneous behaviors are even implicitly
  goal-directed.  Novamente is a goal-oriented but not 100% goal-directed
  system, which is one major difference from AIXI and AIXItl.

 I agree that it is a major difference; does it mean that Novamente can
 interact with humans in useful or morally relevant ways of which AIXI is
 incapable?

Maybe... hmmm.

   In that case you cannot prove any of Hutter's
  theorems about them.  And if you can't prove theorems about
 them then they
  are nothing more than useless abstractions.  Since AIXI can never be
  implemented and AIXItl is so inefficient it could never do
 anything useful
  in practice.

 But they are very useful tools for talking about fundamental kinds of
 intelligence.

I am not sure whether they are or not.

  Well, sure ... it's *roughly analogous*, in the sense that it's
 experiential
  reinforcement learning, sure.

 Is it roughly analogous, but not really analogous, in the sense that
 Novamente can do something AIXI can't?

Well, Novamente will not follow the expectimax algorithm.  So it will
display
behaviors that AIXI will never display.

I'm having trouble, off the cuff and in a hurry, thinking about AIXI in the
context of a human saying to it "In my view, you should adjust your goal
system for this reason..."

If a human says this to Novamente, it may consider the request and may do so.
It may do so if this human has been right about a lot of things in the past,
for example.

If a human says this to AIXI, how does AIXI react and why?  AIXI doesn't have
a goal system in the same sense that Novamente does.  AIXI, if it's smart
enough, could hypothetically figure out what the human meant and use this to
modify its current operating program (but not its basic program-search
mechanism, because AIXI is not self-modifying in such a strong sense)... if
its history told it that listening to humans causes it to get rewarded.  But
it seems to me, intuitively, that the modification AIXI would make in this
case would not constrain or direct AIXI's future development as strongly as
the modification Novamente would make in response to the same human request.
I'm not 100% sure about this though, because my mental model of AIXI's
dynamics is not that good, and I haven't tried to do the math corresponding
to this scenario.

What do you think about AIXI's response to this scenario, Eliezer?

You seem to have your head more fully wrapped around AIXI than I do, at the
moment ;-)

I really should reread the paper, but I don't have time right now.

This little scenario I've just raised does NOT exhaust the potentially
important differences between Novamente and AIXI; it's just one thing that
happened to occur to me.

Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Bill Hibbard wrote:

On Tue, 11 Feb 2003, Ben Goertzel wrote:


Eliezer wrote:


Interesting you should mention that.  I recently read through Marcus
Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a
formal definition of intelligence, it is not a solution of Friendliness
(nor do I have any reason to believe Marcus Hutter intended it as one).

In fact, as one who specializes in AI morality, I was immediately struck
by two obvious-seeming conclusions on reading Marcus Hutter's formal
definition of intelligence:

1)  There is a class of physically realizable problems, which humans can
solve easily for maximum reward, but which - as far as I can tell - AIXI
cannot solve even in principle;


I don't see this, nor do I believe it...


I don't believe it either. Is this a reference to Penrose's
argument based on Goedel's Incompleteness Theorem (which is
wrong)?


Oh, well, in that case, I'll make my statement more formal:

There exists a physically realizable, humanly understandable challenge C 
on which a tl-bounded human outperforms AIXI-tl for humanly understandable 
reasons.  Or even more formally, there exists a computable process P 
which, given either a tl-bounded uploaded human or an AIXI-tl, supplies 
the uploaded human with a greater reward as the result of strategically 
superior actions taken by the uploaded human.

:)

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

 Oh, well, in that case, I'll make my statement more formal:

 There exists a physically realizable, humanly understandable challenge C
 on which a tl-bounded human outperforms AIXI-tl for humanly
 understandable
 reasons.  Or even more formally, there exists a computable process P
 which, given either a tl-bounded uploaded human or an AIXI-tl, supplies
 the uploaded human with a greater reward as the result of strategically
 superior actions taken by the uploaded human.

 :)

 --
 Eliezer S. Yudkowsky

Hmmm.

Are you saying that given a specific reward function and a specific
environment, the tl-bounded uploaded human with resources (t,l) will act so
as to maximize the reward function better than AIXI-tl with resources (T,l),
with T as specified by Hutter's theorem of AIXI-tl optimality?

Presumably you're not saying that, because it would contradict his theorem?

So what clever loophole are you invoking?? ;-)

ben






Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
 Oh, well, in that case, I'll make my statement more formal:

 There exists a physically realizable, humanly understandable
 challenge C on which a tl-bounded human outperforms AIXI-tl for
 humanly understandable reasons.  Or even more formally, there exists
 a computable process P which, given either a tl-bounded uploaded
 human or an AIXI-tl, supplies the uploaded human with a greater
 reward as the result of strategically superior actions taken by the
 uploaded human.

 :)

 -- Eliezer S. Yudkowsky

 Hmmm.

 Are you saying that given a specific reward function and a specific
 environment, the tl-bounded uploaded human with resources (t,l) will
 act so as to maximize the reward function better than AIXI-tl with
 resources (T,l) with T as specified by Hutter's theorem of AIXI-tl
 optimality?

 Presumably you're not saying that, because it would contradict his
 theorem?

Indeed.  I would never presume to contradict Hutter's theorem.

 So what clever loophole are you invoking?? ;-)

An intuitively fair, physically realizable challenge with important 
real-world analogues, solvable by the use of rational cognitive reasoning 
inaccessible to AIXI-tl, with success strictly defined by reward (not a 
Friendliness-related issue).  It wouldn't be interesting otherwise.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] Godel and AIXI

2003-02-11 Thread Shane Legg

Which is more or less why I figured you weren't going to do
a Penrose on us, as you would then face the usual reply...

Which begs the million dollar question:

Just what is this cunning problem that you have in mind?

:)

Shane

Eliezer S. Yudkowsky wrote:

Shane Legg wrote:


Eliezer S. Yudkowsky wrote:


An intuitively fair, physically realizable challenge with important 
real-world analogues, solvable by the use of rational cognitive 
reasoning inaccessible to AIXI-tl, with success strictly defined by 
reward (not a Friendliness-related issue).  It wouldn't be 
interesting otherwise.


Give the AIXI a series of mathematical hypotheses, some of which are
Godelian-like statements, ask the AIXI if each statement is true, and
then reward it for each correct answer?

I'm just guessing here... this seems too Penrose-like, so I suppose
you have something quite different?



Indeed.

Godel's Theorem is widely misunderstood.  It doesn't show that humans 
can understand mathematical theorems which AIs cannot.  It does not even 
show that there are mathematical truths not provable in the Principia 
Mathematica.

Godel's Theorem actually shows that *if* mathematics and the Principia 
Mathematica are consistent, *then* Godel's statement is true, but not 
provable in the Principia Mathematica.  We don't actually *know* that 
the Principia Mathematica, or mathematics itself, is consistent.  We 
just know we haven't yet run across a contradiction.  The rest is 
induction, not deduction.

The only thing we know is that *if* the Principia is consistent *then* 
Godel's statement is true but not provable in the Principia.  But in 
fact this statement itself can be proved in the Principia.  So there are 
no mathematical truths accessible to human deduction but not machine 
deduction.  Godel's statement is accessible neither to human deduction 
nor machine deduction.
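
For reference, the standard textbook rendering of this point, with Prov and
Con the usual provability and consistency predicates for the Principia (PM):

   PM \vdash ( G \leftrightarrow \neg\mathrm{Prov}_{PM}(\ulcorner G \urcorner) )
       % the diagonal lemma: G "says" of itself that it is unprovable
   PM \vdash ( \mathrm{Con}(PM) \rightarrow G )
       % the formalized first incompleteness theorem -- this conditional
       % IS provable inside PM

What is not provable in PM (assuming PM is consistent) is G itself, and
likewise Con(PM).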

Of course, Godel's statement is accessible to human *induction*.  But it 
is just as accessible to AIXI-tl's induction as well.  Moreover, any 
human reasoning process used to assign perceived truth to mathematical 
theorems, if it is accessible to the combined inductive and deductive 
reasoning of a tl-bounded human, is accessible to the pure inductive 
reasoning of AIXI-tl as well.

In prosaic terms, AIXI-tl would probably induce a Principia-like system 
for the first few theorems you showed it, but as soon as you punished it 
for getting Godel's Statement wrong, AIXI-tl would induce a more complex 
cognitive system, perhaps one based on induction as well as deduction, 
that assigned truth to Godel's statement.  At the limit AIXI-tl would 
induce whatever algorithm represented the physically realized 
computation you were using to invent and assign truth to Godel 
statements.  Or to be more precise, AIXI-tl would induce the algorithm 
the problem designer used to assign truth to mathematical theorems; 
perfectly if the problem designer is tl-bounded or imitable by a 
tl-bounded process; otherwise at least as well as any tl-bounded human 
could from a similar pattern of rewards.
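
To make the mechanism concrete, here is a toy sketch in plain Python --
nothing like AIXI-tl's actual machinery, and the hypotheses and description
lengths are invented for the example -- of a Solomonoff-style mixture of
candidate truth-assigners in which any deterministic hypothesis contradicted
by the reward signal drops out:

   # Toy illustration only -- not AIXI-tl.  The reward signal stands in for
   # the problem designer's truth assignments.
   theorems = [
       {"name": "2 + 2 = 4",      "provable": True,  "godel": False, "rewarded_as_true": True},
       {"name": "PM theorem #7",  "provable": True,  "godel": False, "rewarded_as_true": True},
       {"name": "Godel sentence", "provable": False, "godel": True,  "rewarded_as_true": True},
   ]

   hypotheses = {
       # name: (made-up description length in bits, predictor for "is this theorem true?")
       "pure deduction":        (10, lambda t: t["provable"]),
       "deduction + induction": (14, lambda t: t["provable"] or t["godel"]),
   }

   # shorter programs start out with more weight, as in the Solomonoff prior
   weights = {name: 2.0 ** -length for name, (length, _) in hypotheses.items()}

   for t in theorems:
       for name, (_, predict) in hypotheses.items():
           if predict(t) != t["rewarded_as_true"]:
               weights[name] = 0.0   # contradicted by the punishment, so it drops out

   print(max(weights, key=weights.get))   # -> "deduction + induction"

The real construction ranges over all tl-bounded programs rather than two
hand-picked rules, but the qualitative effect is the one described above:
punish the purely deductive hypothesis once and the weight shifts to a
richer one.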

Actually, humans probably aren't really all that good at spot-reading 
Godel statements.  If you get tossed a series of Godel statements and 
you learned to decode the diagonalization involved, so that you could 
see *something* was being diagonalized, then the inductive inertia of 
your success at declaring all those statements true would probably lead 
you to blindly declare the truth of your own unidentified Godel 
statement, thus falsifying it.  Thus I'd expect AIXI-tl to far 
outperform tl-bounded humans on any fair Godel-statement-spotting 
tournament (arranged by AIXI, of course).



---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

   So what clever loophole are you invoking?? ;-)

 An intuitively fair, physically realizable challenge with important
 real-world analogues, solvable by the use of rational cognitive reasoning
 inaccessible to AIXI-tl, with success strictly defined by reward (not a
 Friendliness-related issue).  It wouldn't be interesting otherwise.

 --
 Eliezer S. Yudkowsky

Well, when you're ready to spill, we're ready to listen ;)

I am guessing it utilizes the reward function in an interesting sort of
way...

ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel


 It seems to me that this answer *assumes* that Hutter's work is completely
 right, an assumption in conflict with the uneasiness you express in your
 previous email.

It's right as mathematics...

I don't think his definition of intelligence is the  maximally useful one,
though I think it's a reasonably OK one.

I have proposed a different but related definition of intelligence, before,
and have not been entirely satisfied with my own definition, either.  I like
mine better than Hutter's... but I have not proved any cool theorems about
mine...

 If Novamente can do something AIXI cannot, then Hutter's
 work is very highly valuable because it provides a benchmark against which
 this becomes clear.

 If you intuitively feel that Novamente has something AIXI doesn't, then
 Hutter's work is very highly valuable whether your feeling proves correct
 or not, because it's by comparing Novamente against AIXI that you'll learn
 what this valuable thing really *is*.  This holds true whether the answer
 turns out to be It's capability X that I didn't previously really know
 how to build, and hence didn't see as obviously lacking in AIXI or It's
 capability X that I didn't previously really know how to build, and
 hence didn't see as obviously emerging from AIXI.

 So do you still feel that Hutter's work tells you nothing of any use?

Well, it hasn't so far.

It may in the future.  If it does I'll say so ;-)

The thing is, I (like many others) thought of algorithms equivalent to AIXI
years ago, and dismissed them as useless.  What I didn't do is prove
anything about these algorithms; I just thought of them and ignored them.
Partly because I didn't see how to prove the theorems, and partly because I
thought that even once I proved the theorems, I wouldn't have anything
pragmatically useful...

-- Ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] AIXI and Solomonoff induction

2003-02-11 Thread Cliff Stabbert
Tuesday, February 11, 2003, 9:44:31 PM, Shane Legg wrote:

SL However even within this scenario the concept of fixed goal is
SL something that we need to be careful about.  The only real goal
SL of the AIXI system is to get as much reward as possible from its
SL environment.  A goal is just our description of what that means.
SL If the AI gets reward for winning at chess then quickly it will get
SL very good at chess.  If it then starts getting punished for winning
SL it will then quickly switch to losing at chess.  Has the goal of
SL the system changed?  Perhaps not.  Perhaps the goal always was:
SL Win at chess up to point x in time and then switch to losing.
SL So we could say that the goal was always fixed, it's just that up
SL to point x in time the AI thought the goal was to always win and it
SL wasn't until after point x in time that it realised that the real
SL goal was actually slightly more complex.  In which case does it make
SL any sense to talk about AIXI as being limited by having fixed goals?
SL I think not.

Perhaps someone can clarify some issues for me.

I'm not good at math -- I can't follow the AIXI materials and I don't
know what Solomonoff induction is.  So it's unclear to me how a
certain goal is mathematically defined in this uncertain, fuzzy
universe. 

What I'm assuming, at this point, is that AIXI and Solomonoff
induction depend on operation in a somehow predictable universe -- a
universe with some degree of entropy, so that its data is to some
extent compressible.  Is that more or less correct?

And in that case, goals can be defined by feedback given to the
system, because the desired behaviour patterns it induces from the
feedback *predictably* lead to the desired outcomes, more or less?

I'd appreciate if someone could tell me if I'm right or wrong on this,
or point me to some plain english resources on these issues, should
they exist.  Thanks.

--
Cliff

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

 It's right as mathematics...

 I don't think his definition of intelligence is the  maximally useful
 one, though I think it's a reasonably OK one.

 I have proposed a different but related definition of intelligence,
 before, and have not been entirely satisfied with my own definition,
 either.  I like mine better than Hutter's... but I have not proved any
 cool theorems about mine...

Can Hutter's AIXI satisfy your definition?

 If Novamente can do something AIXI cannot, then Hutter's work is very
 highly valuable because it provides a benchmark against which this
 becomes clear.

 If you intuitively feel that Novamente has something AIXI doesn't,
 then Hutter's work is very highly valuable whether your feeling
 proves correct or not, because it's by comparing Novamente against
 AIXI that you'll learn what this valuable thing really *is*.  This
 holds true whether the answer turns out to be It's capability X that
 I didn't previously really know how to build, and hence didn't see as
 obviously lacking in AIXI or It's capability X that I didn't
 previously really know how to build, and hence didn't see as
 obviously emerging from AIXI.

 So do you still feel that Hutter's work tells you nothing of any use?

 Well, it hasn't so far.

 It may in the future.  If it does I'll say so ;-)

 The thing is, I (like many others) thought of algorithms equivalent to
 AIXI years ago, and dismissed them as useless.  What I didn't do is
 prove anything about these algorithms; I just thought of them and
 ignored them.  Partly because I didn't see how to prove the
 theorems, and partly because I thought even once I proved the theorems,
 I wouldn't have anything pragmatically useful...

It's not *about* the theorems.  It's about whether the assumptions
**underlying** the theorems are good assumptions to use in AI work.  If
Novamente can outdo AIXI then AIXI's assumptions must be 'off' in some way
and knowing this *explicitly*, as opposed to having a vague intuition
about it, cannot help but be valuable.

Again, it sounds to me like, in this message, you're taking for *granted*
that AIXI and Novamente have the same theoretical foundations, and that
hence the only issue is design and how much computing power is needed, in
which case I can see why it would be intuitively straightforward to you
that (a) Novamente is a better approach than AIXI and (b) AIXI has nothing
to say to you about the pragmatic problem of designing Novamente, nor are
its theorems relevant in building Novamente, etc.  But that's exactly the
question I'm asking you.  *Do* you believe that Novamente and AIXI rest on
the same foundations?

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


Re: [agi] AIXI and Solomonoff induction

2003-02-11 Thread Shane Legg

Hi Cliff,


I'm not good at math -- I can't follow the AIXI materials and I don't
know what Solomonoff induction is.  So it's unclear to me how a
certain goal is mathematically defined in this uncertain, fuzzy
universe. 

In AIXI you don't really define a goal as such.  Rather you have
an agent (the AI) that interacts with a world and as part of that
interaction the agent gets occasional reward signals.  The agent's
job is to maximise the amount of reward it gets.

So, if the environment contains me and I show the AI chess positions
and interpret its outputs as being moves that the AI wants to make
and then give the AI reward when ever it wins... then you could say
that the goal of the system is to win at chess.

Equally we could also mathematically define the relationship between
the input data, output data and the reward signal for the AI.  This
would be a mathematically defined environment and again we could
interpret part of this as being the goal.

Clearly the relationship between the input data, the output data and
the reward signal has to be in some sense computable for such a system
to work (I say in some sense as the environment doesn't have to be
deterministic; it just has to have computationally compressible
regularities).  That might seem restrictive, but if it weren't the case
then AI on a computer would simply be impossible, as there would be no
computationally expressible solution anyway.  It's also pretty clear
that the world we live in does have a lot of computationally
expressible regularities.
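
A minimal sketch of that interaction cycle in ordinary Python -- the
two-move "chess" environment and the greedy agent are stand-ins invented
for the example; AIXI itself maximises expected future reward over a
Solomonoff mixture rather than doing anything this crude:

   import random

   # Illustrative environment: reward winning for the first 100 games,
   # then silently flip the rule and reward losing instead.
   def chess_environment(action, game_number):
       won = (action == "winning_move")
       if game_number < 100:
           reward = 1 if won else 0
       else:
           reward = 1 if not won else 0
       observation = "win" if won else "loss"
       return observation, reward

   # Not AIXI: a greedy agent that repeats whichever action has earned
   # the most reward so far.
   def greedy_agent(history):
       totals = {"winning_move": 0, "losing_move": 0}
       for action, _, reward in history:
           totals[action] += reward
       if totals["winning_move"] == totals["losing_move"]:
           return random.choice(list(totals))
       return max(totals, key=totals.get)

   history = []
   for game in range(200):
       action = greedy_agent(history)
       observation, reward = chess_environment(action, game)
       history.append((action, observation, reward))

   print(sum(r for _, _, r in history))   # total reward collected

The point of the toy is only that the "goal" lives entirely in the
environment's reward rule; nothing inside the agent mentions chess or
winning, which is exactly the sense in which AIXI has no goal other
than reward.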



What I'm assuming, at this point, is that AIXI and Solomonoff
induction depend on operation in a somehow predictable universe -- a
universe with some degree of entropy, so that its data is to some
extent compressible.  Is that more or less correct?


Yes, if the universe is not somehow predictable in the sense of
being compressible then the AI will be screwed.  It doesn't have
to be perfectly predictable; it just can't be random noise.
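
For anyone who wants the formal version, "predictable in the sense of being
compressible" is usually cashed out via the Solomonoff prior over a monotone
universal machine U (see Marcus's papers for the precise conditions):

   M(x) \;=\; \sum_{p\,:\,U(p)\,=\,x*} 2^{-\ell(p)},
   \qquad
   M(x_{t+1} \mid x_{1:t}) \;=\; \frac{M(x_{1:t}\,x_{t+1})}{M(x_{1:t})}

   % the sum ranges over (minimal) programs p whose output begins with x

Sequences with short generating programs soak up most of the prior weight,
which is why a compressible environment is learnable and pure noise is not.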



And in that case, goals can be defined by feedback given to the
system, because the desired behaviour patterns it induces from the
feedback *predictably* lead to the desired outcomes, more or less?


Yeah.



I'd appreciate if someone could tell me if I'm right or wrong on this,
or point me to some plain english resources on these issues, should
they exist.  Thanks.


The work is very new and there aren't, as far as I know, alternate
texts on the subject, just Marcus Hutter's various papers.
I am planning on writing a very simple introduction to Solomonoff
Induction and AIXI before too long that leaves out a lot of the
maths and concentrates on the key concepts.  Aside from being a good
warm up before I start working with Marcus soon, I think it could
be useful as I feel that the real significance of his work is being
missed by a lot of people out there due to all the math involved.

Marcus has mentioned that he might write a book about the subject
at some time but seemed to feel that the area needed more time to
mature before then as there is still a lot of work to be done and
important questions to explore... some of which I am going to be
working on :)



I should add, the example you gave is what raised my questions: it
seems to me an essentially untrainable case because it presents a
*non-repeatable* scenario.


In what sense is it untrainable?  The system learns to win at chess.
It then starts getting punished for winning and switches to losing.
I don't see what the problem is.



If I were to give an AGI a 1,000-page book, and on the first 672
pages was written the word Not, it might predict that the 673rd page
will read Not.  But I could choose to make that page blank,
and in that scenario, as in the above, I don't see how any algorithm,
no matter how clever, could make that prediction (unless it included
my realtime brainscans, etc.)


Yep, even an AIXI super AGI isn't going to be psychic.  The thing is
that you can never be 100% certain based on finite evidence.  This is
a central problem with induction.  Perhaps in ten seconds gravity will
suddenly reverse and start to repel rather than attract.  Perhaps
gravity as we know it is just a physical law that only holds for the
first 13.7 billion years of the universe and then reverses?  It seems
very very very unlikely, but we are not 100% certain that it won't
happen.
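
A concrete way to see the "never 100% certain" point, applied to Cliff's
book example: under a simple uniform prior on the unknown page frequency
(Laplace's rule of succession -- not AIXI's Solomonoff prior, but the moral
is the same),

   P(\text{page 673 reads "Not"} \mid \text{672 "Not" pages so far})
      \;=\; \frac{672 + 1}{672 + 2} \;=\; \frac{673}{674} \;\approx\; 0.9985

High, but never 1 -- no finite run of observations ever closes the gap
completely.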

Cheers
Shane



---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


Re[2]: [agi] AIXI and Solomonoff induction

2003-02-11 Thread Cliff Stabbert
Wednesday, February 12, 2003, 12:00:56 AM, Shane Legg wrote:

SL Yes, if the universe is not somehow predictable in the sense of
SL being compressible then the AI will be screwed.  It doesn't have
SL to be perfectly predictable; it just can't be random noise.

Thanks for your responses.

Some more questions:

So Solomonoff induction, whatever that precisely is, depends on a
somehow compressible universe.  Do the AIXI theorems *prove* something
along those lines about our universe, or do they *assume* a
compressible universe (i.e. do they state IF the universe is somehow
compressible, these algorithms (given infinite resources) can figure
out how)?

Assuming the latter, does that mean that there is a mathematical
definition of 'pattern'?  As I stated I'm not a math head, but with
what little knowledge I have I find it hard to imagine pattern as a
definable entity, somehow.

To get down to cases:

 I should add, the example you gave is what raised my questions: it
 seems to me an essentially untrainable case because it presents a
 *non-repeatable* scenario.

SL In what sense is it untrainable?  The system learns to win at chess.
SL It then starts getting punished for winning and switches to losing.
SL I don't see what the problem is.

OK, let's say you reward it for winning during the first 100 games,
then punish it for winning / reward it for losing during the next 100,
reward it for winning the next 100, etc.  Can it perceive that pattern?

Given infinite resources, could it determine that I am deciding to
punish or reward a win based on a pseudo-random number generator
(65536-cyclic, or whatever it's called)?

And if the compressibility of the Universe is an assumption, is
there a way we might want to clarify such an assumption, i.e., aren't
there numerical values that attach to the *likelihood* of gravity
suddenly reversing direction; numerical values attaching to the
likelihood of physical phenomena which spontaneously negate like the
chess-reward pattern; etc.?

In fact -- would the chess-reward pattern's unpredictability *itself*
be an indication of life?  I.e., doesn't Ockham's razor fail in the
case of, and possibly *only* in the case of, conscious beings*?


--
Cliff

*I can elaborate on this if necessary.

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]