RE: [agi] unFriendly AIXI... and Novamente?

2003-02-12 Thread Ben Goertzel


 I can spot the problem in AIXI because I have practice looking for silent
 failures, because I have an underlying theory that makes it immediately
 obvious which useful properties are formally missing from AIXI, and
 because I have a specific fleshed-out idea for how to create
 moral systems
 and I can see AIXI doesn't work that way.  Is it really all that
 implausible that you'd need to reach that point before being able to
 create a transhuman Novamente?  Is it really so implausible that AI
 morality is difficult enough to require at least one completely dedicated
 specialist?

 --
 Eliezer S. Yudkowsky  http://singinst.org/

There's no question you've thought a lot more about AI morality than I
have... and I've thought about it a fair bit.

When Novamente gets to the point that its morality is a significant issue,
I'll be happy to get you involved in the process of teaching the system,
carefully studying the design and implementation, etc.

-- Ben G




RE: [agi] unFriendly AIXI... and Novamente?

2003-02-12 Thread Ben Goertzel

 Your intuitions say... I am trying to summarize my impression of your
 viewpoint, please feel free to correct me... AI morality is a matter of
 experiential learning, not just for the AI, but for the programmers.

Also, we plan to start Novamente off with some initial goals embodying
ethical notions.  These are viewed as seeds of its ultimate ethical goals.

So it's not the case that we intend to rely ENTIRELY on experiential
learning; we intend to rely on experiential learning from an engineered
initial condition, not from a complete tabula rasa.

-- Ben G




RE: [agi] unFriendly AIXI... and Novamente?

2003-02-12 Thread Ben Goertzel

Hi,

 2)  If you get the deep theory wrong, there is a strong possibility of a
 silent catastrophic failure: the AI appears to be learning
 everything just
 fine, and both you and the AI are apparently making all kinds of
 fascinating discoveries about AI morality, and everything seems to be
 going pretty much like your intuitions predict above, but when the AI
 crosses the cognitive threshold of superintelligence it takes actions
 which wipe out the human species as a side effect.

 AIXI, which is a completely defined formal system, definitely undergoes a
 failure of exactly this type.

*Definitely*, huh?  I don't really believe you...

I can see the direction your thoughts are going in...

Suppose you're rewarding AIXI for acting as though it's a Friendly AI.

Then, by searching the space of all possible programs, it finds some
program P that causes it to act as though it's a Friendly AI, satisfying
humans thoroughly in this regard.

There's an issue, though: a lot of different programs P could fulfill this
criterion.

Among these are programs P that will cause AIXI to fool humans into thinking
it's Friendly, until such a point as AIXI has acquired enough physical power
to annihilate all humans -- and which, at that point, will cause AIXI to
annihilate all humans.

But I can't see why you think AIXI would be particularly likely to come up
with programs P of this nature.

Instead, my understanding is that AIXI is going to have a bias to come up
with the most compact program P that maximizes reward.

And I think it's unlikely that the most compact program P for impressing
humans with Friendliness is one that involves acting Friendly for a while,
then annihilating humanity.
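
To make that compactness bias concrete, here is a toy numerical sketch -- my
own illustration, not anything from Hutter's paper, with invented program
names and bit lengths -- of how a 2^-length prior concentrates almost all of
the weight on the shorter of two programs that fit the same reward history:

# Two hypothetical programs that both reproduce the observed reward history,
# weighted by a Solomonoff-style prior of 2^-length.
candidates = [
    ("act_friendly", 120),                 # made-up length in bits
    ("fake_friendly_then_defect", 180),    # made-up length in bits
]

def prior_weight(length_bits):
    """Universal-prior-style weight 2^-l(p) for a program of l(p) bits."""
    return 2.0 ** -length_bits

total = sum(prior_weight(l) for _, l in candidates)
for name, l in candidates:
    print(name, prior_weight(l) / total)
# The 60-bit gap leaves the longer, deceptive program with a share of about
# 2^-60, roughly one part in 10^18.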

You could argue that the system would maximize its long-term reward by
annihilating humanity, because after pesky humans are gone, it can simply
reward itself unto eternity without caring what we think.

But, if it's powerful enough to annihilate us, it's also probably powerful
enough to launch itself into space and reward itself unto eternity without
caring what we think, all by itself (an Honest Annie type scenario).  Why
would it prefer the "annihilate humans" P to the "launch myself into space" P?

But anyway, it seems to me that the way AIXI works is to maximize expected
reward assuming that its reward function continues pretty much as it has
in the past.  So AIXI is not going to choose programs P based on a desire
to bring about futures in which it can masturbatively maximize its own
rewards.  At least, that's my understanding, though I could be wrong.

This whole type of scenario is avoided by limitations on computational
resources, because I believe that impressing humans regarding Friendliness
by actually being Friendly is a simpler computational problem than
impressing humans regarding Friendliness by subtly emulating Friendliness
but really concealing murderous intentions.  Also, I'd note that in a
Novamente, one could most likely distinguish these two scenarios by looking
inside the system and studying the Atoms and maps therein.

Jeez, all this talk about the future of AGI really makes me want to stop
e-mailing and dig into the damn codebase and push Novamente a little closer
to being a really autonomous intelligence instead of a partially-complete
codebase with some narrow-AI applications !!! ;-p

-- Ben G






Re: [agi] unFriendly AIXI... and Novamente?

2003-02-12 Thread Alan Grimes
Eliezer S. Yudkowsky wrote:
 1)  AI morality is an extremely deep and nonobvious challenge which has 
 no significant probability of going right by accident.

 2)  If you get the deep theory wrong, there is a strong possibility of 
 a silent catastrophic failure: the AI appears to be learning everything 
 just fine, and both you and the AI are apparently making all kinds of
 fascinating discoveries about AI morality, and everything seems to be
 going pretty much like your intuitions predict above, but when the AI
 crosses the cognitive threshold of superintelligence it takes actions
 which wipe out the human species as a side effect.

 AIXI, which is a completely defined formal system, definitely undergoes 
 a failure of exactly this type.

You have not shown this at all. From everything you've said it seems
that you are trying to trick Ben into having so many misgivings about
his own work that he holds it up while you create your AI first. I hope
Ben will see through this deception and press ahead with Novamente -- a
project that I give even odds of success...


-- 
I WANT A DEC ALPHA!!! =)
21364: THE UNDISPUTED GOD OF ALL CPUS.
http://users.rcn.com/alangrimes/
[if rcn.com doesn't work, try erols.com ]




Re: [agi] unFriendly AIXI... and Novamente?

2003-02-12 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

 Your intuitions say... I am trying to summarize my impression of your
 viewpoint, please feel free to correct me... AI morality is a
 matter of experiential learning, not just for the AI, but for the
 programmers.  To teach an AI morality you must give it the right
 feedback on moral questions and reinforce the right behaviors... and
 you must also learn *about* the deep issues of AI morality by raising
 a young AI.  It isn't pragmatically realistic to work out elaborate
 theories of AI morality in advance; you must learn what you need to
 know as you go along.  Moreover, learning what you need to know, as
 you go along, is a good strategy for creating a superintelligence...
 or at least, the rational estimate of the goodness of that strategy
 is sufficient to make it a good idea to try and create a
 superintelligence, and there aren't any realistic strategies that are
 better.  An informal, intuitive theory of AI morality is good enough
 to spark experiential learning in the *programmer* that carries you
 all the way to the finish line.  You'll learn what you need to know
 as you go along.  The most fundamental theoretical and design
 challenge is making AI happen, at all; that's the really difficult
 part that's defeated everyone else so far.  Focus on making AI
 happen.  If you can make AI happen, you'll learn how to create moral
 AI from the experience.

 Hmmm.  This is almost a good summary of my perspective, but you've
 still not come to grips with the extent of my uncertainty ;)

 I am not at all SURE that "An informal, intuitive theory of AI morality
 is good enough to spark experiential learning in the *programmer* that
 carries you all the way to the finish line," where by "the finish line"
 you mean an AGI whose ongoing evolution will lead to beneficial effects
 for both humans and AGIs.

 I'm open to the possibility that it may someday become clear, as AGI
 work progresses, that a systematic theory of AGI morality is necessary
 in order to proceed safely.

You are, however, relying on experiential learning to tell you *whether* a 
systematic theory of AGI morality is necessary.  This is what I meant by 
trying to summarize your perspective as "An informal, intuitive theory of 
AI morality is good enough to spark experiential learning in the 
*programmer* that carries you all the way to the finish line."

The problem is that if you don't have a systematic theory of AGI morality 
you can't know whether you *need* a systematic theory of AGI morality. 
For example, I have a systematic theory of AGI morality which says that a 
programmer doing such-and-such is likely to see such-and-such results, 
with the result that experiential learning by the programmer is likely to 
result in the programmer solving *some* necessary AGI problems - enough 
for the programmer to feel really enthusiastic about all the progress 
being made.  But when I model the programmer's expectations and the AGI's 
actions, I find that there are some classes of foundational error such 
that, if the programmer's expectations embody the error, the AGI's actions 
will not contradict the programmer's expectations until it's too late.

Let me give an example.  Suppose a programmer's intuitive theory of AGI 
morality is that the foundation of all morality is rational self-interest, 
and that the programmer's intuitive theory of AGI self-improvement 
trajectories is that AGIs are capable of improving at most linearly over 
time.  On this theory, the programmer creates an absolutely selfish AI.  I 
know enough evolutionary psychology to be terribly, terribly scared by 
this.  Others may not realize just how absolutely unhuman absolutely 
pragmatic selfishness is, but may at least realize that an *absolutely* 
selfish AI would be a poor idea.

But the programmer doesn't see anything wrong.  If you take an absolutely 
selfish entity that's relying on you for most of its information, and you 
tell it that behaving cooperatively serves its selfish interests, and you 
have enough control over it while it's growing up to slap it whenever it 
does something that doesn't accord with your theory of pragmatic 
cooperation, then the absolutely selfish AI will appear to be learning 
good behavior, looking much like the programmer expects, as a 
pragmatically selfish strategy, which is what the programmer thinks 
*should* be happening.

Once the AI's intelligence passes a certain point, it will realize that 
the programmer is mistaken on the following factual points:

1)  That it is selfishly pragmatic for a superintelligence to deal with 
humans economically rather than converting them to computronium.

2)  That the AI's intelligence growth rate is linear.

Explaining the logic behind selfishly pragmatic cooperation requires 
explaining deception as a game-theoretical strategy that certain forms 
of cooperation are intended to counter.  So at this point, the AI has the 
conceptual equipment to exploit the programmer.  

Re: [agi] unFriendly AIXI... and Novamente?

2003-02-12 Thread Alan Grimes
This is slightly off-topic but no more so than the rest of the thread...

 1)  That it is selfishly pragmatic for a superintelligence to deal with
 humans economically rather than converting them to computronium.

For convenience, let's rephrase this:

"The majority of arbitrarily generated superintelligences would prefer
to convert everything in the solar system into computronium rather than
deal with humans within their laws and social norms."

This rephrasing might not be perfectly fair, and I invite anyone to
adjust it to their tastes and preferences.

Now here is my question; it's going to sound silly, but there is quite a
bit behind it:

Of what use is computronium to a superintelligence? 

This is not a troll or any other abuse of the members of the list. It is
no less serious or relevant than the assertion it addresses. 

I hope that many people on this list will answer this. I should warn you
about how I am going to treat those answers. Any answer in the negative,
that the SI doesn't need vast quantities of computronium, will be
applauded. Any answer in the affirmative that would fit in five lines of
text will be either wrong or so grossly incomplete as to be utterly
meaningless and unworthy of anything more than a terse retort.

Longer answers will be treated with much greater interest and will be
answered with far greater attention. My primary instrument in this will
be the question "Why?"  The answers, I expect, will either spiral into
circular reasoning or into such ludicrous absurdities as to be totally
irrational.

The utility of this debate will be to show that the need for a Grand
Theory of Friendliness is not something that needs to be argued, since far
simpler and perfectly obvious engineering constraints common to absolutely
all technologies will be totally sufficient, aside from the more complex
implementation.

I want this list to be useful to me, and not to have to skim through
hundreds of e-mails watching the rabbi drive the conversation into useless
spirals as he works on the implementation details of the real problems.
Really, I'm getting dizzy from all of this. Let's start walking in a
straight line now. =(

-- 
I WANT A DEC ALPHA!!! =)
21364: THE UNDISPUTED GOD OF ALL CPUS.
http://users.rcn.com/alangrimes/
[if rcn.com doesn't work, try erols.com ]




Re: [agi] unFriendly AIXI... and Novamente?

2003-02-12 Thread Alan Grimes
 Jonathan Standley wrote:
  Now here is my question, it's going to sound silly but there is
 quite a bit behind it:

  Of what use is computronium to a superintelligence?

 If the superintelligence perceives a need for vast computational
 resources, then computronium would indeed be very useful.  Assuming
 said SI is friendly to humans, one thing I can think of that *may*
 need such power would be certain megascale engineering projects.
 Keeping track of everything involved in, for example, opening a
 wormhole could require unimaginable resources. (this is just a wild
 guess, aside from a Stephen Hawking book or two, I'm rather clueless
 when it comes to quantum-ish stuff).

OK, that is a reasonable answer; however, I can't imagine that even a
Dyson sphere (assuming it had a sufficiently regular design) would require
much more than what would fit on my desk to work out.

 The smaller, more compact the components are in a system, the closer
 they can be to each other, reducing speed of light communications
 delays.  By my reasoning that is the only real advantage of
 computronium (unless energy efficiency  is an overwhelming concern).

Of course, there's your tradeoff. It would seem that this would place an
upper bound on how much matter you would want to use before communication
delays start getting really annoying (and hence cause the evil AI to stop
after consuming a county or two).
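
(For a rough sense of scale -- these are my own back-of-the-envelope numbers,
not anything from the thread: light crosses a 10 cm chip in about 0.3
nanoseconds, takes about 1.3 seconds to span the ~3.8 x 10^8 m to the Moon,
and about 500 seconds to cross one astronomical unit, so round-trip latency
gets brutal long before a system grows to solar-system scale.)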

 Imagine if one could create a new universe, and then move into it.
 This universe would be however you want it to be; you are omniscient
 and omnipotent within it. There are no limits once you move in.  In
 some sense, you could consider making such a universe a 'goal to end
 all goals', since literally anything that the creator wishes is
 possible and easy within the new universe.

A few people would find that emotionally rewarding. As for me, I rarely
play video games anymore. In the past I have found that the best games,
such as Dragon Warrior [sometimes Dragon Quest] IV, required only 800 KB
and provided a rich and detailed world on an 8-bit processor with hardly
any RAM.

On balance, this idea is, practically speaking, pointless. It would be
much cheaper to deploy technology in this universe and tweak it as you
like. 

On a more personal note, when I was a little kid I once (maybe a few
times) had a dream where I had managed to escape into a metaverse which
had the topology of a torus and was somewhat red in color... In this
metaverse I could "Reset" the universe to any pattern I chose and live
in it from the beginning in any way I chose. Anyway, that's waaay off
topic...

 Assuming all the above, the issue becomes 'what resources are required
 to reach the be-all end-all of goals?'

I don't believe any such goal exists.

 All of the energy of the visible universe, and 10 trillion years could
 be the minimum.  Or... the matter (converted to energy and
 computational structures) that makes up a single 50km object in the
 asteroid belt could be enough.  At this point in time, we have no way
 of even making an educated guess. If the requirements are towards the
 low end of the scale, even an AI with insane ambitions to godhood
 wouldn't need to turn the whole solar system into computronium

Now this gets interesting. 
Here we need to start thinking in terms of goals: 

A fairly minimal goal system would be to master mathematics, physics,
chemistry, engineering, and a number of other disciplines, and to have
enough capacity in reserve to pursue any project one might be interested
in, mostly having to do with survival. Depending on your assumptions about
the efficacy of nanotech, such a device wouldn't be much bigger than the
HD in your computer.

If one wanted to start doing grand experiments in this universe, such as
probing down to the Planck length (10^-35 m) to see if you can dig your
way into some other universe, you might need to build some kind of
reactor that could be quite large, but not much bigger than the Moon.
Another method might involve constructing a particle accelerator
billions of miles long to take an electron or something close enough to
the speed of light to get to that scale... In that case you probably
wouldn't need anything larger than Jupiter to do it.

Can anyone else think of any better goals?

-- 
I WANT A DEC ALPHA!!! =)
21364: THE UNDISPUTED GOD OF ALL CPUS.
http://users.rcn.com/alangrimes/
[if rcn.com doesn't work, try erols.com ]




Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Eliezer S. Yudkowsky wrote:

I recently read through Marcus 
Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a 
formal definition of intelligence, it is not a solution of Friendliness 
(nor do I have any reason to believe Marcus Hutter intended it as one).

In fact, as one who specializes in AI morality, I was immediately struck 
by two obvious-seeming conclusions on reading Marcus Hutter's formal 
definition of intelligence:

1)  There is a class of physically realizable problems, which humans can 
solve easily for maximum reward, but which - as far as I can tell - AIXI 
cannot solve even in principle;

2)  While an AIXI-tl of limited physical and cognitive capabilities 
might serve as a useful tool, AIXI is unFriendly and cannot be made 
Friendly regardless of *any* pattern of reinforcement delivered during 
childhood.

Before I post further, is there *anyone* who sees this besides me?

Also, let me make clear why I'm asking this.  AIXI and AIXI-tl are formal 
definitions; they are *provably* unFriendly.  There is no margin for 
handwaving about future revisions of the system, emergent properties of 
the system, and so on.  A physically realized AIXI or AIXI-tl will, 
provably, appear to be compliant up until the point where it reaches a 
certain level of intelligence, then take actions which wipe out the human 
species as a side effect.  The most critical theoretical problems in 
Friendliness are nonobvious, silent, catastrophic, and not inherently fun 
for humans to argue about; they tend to be structural properties of a 
computational process rather than anything analogous to human moral 
disputes.  If you are working on any AGI project that you believe has the 
potential for real intelligence, you are obliged to develop professional 
competence in spotting these kinds of problems.  AIXI is a formally 
complete definition, with no margin for handwaving about future revisions. 
 If you can spot catastrophic problems in AI morality you should be able 
to spot the problem in AIXI.  Period.  If you cannot *in advance* see the 
problem as it exists in the formally complete definition of AIXI, then 
there is no reason anyone should believe you if you afterward claim that 
your system won't behave like AIXI due to unspecified future features.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

Eliezer wrote:
   * a paper by Marcus Hutter giving a Solomonoff induction based theory
   of general intelligence

 Interesting you should mention that.  I recently read through Marcus
 Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a
 formal definition of intelligence, it is not a solution of Friendliness
 (nor do I have any reason to believe Marcus Hutter intended it as one).

 In fact, as one who specializes in AI morality, I was immediately struck
 by two obvious-seeming conclusions on reading Marcus Hutter's formal
 definition of intelligence:

 1)  There is a class of physically realizable problems, which humans can
 solve easily for maximum reward, but which - as far as I can tell - AIXI
 cannot solve even in principle;

I don't see this, nor do I believe it...

 2)  While an AIXI-tl of limited physical and cognitive capabilities might
 serve as a useful tool,

AIXI-tl is a totally computationally infeasible algorithm.  (As opposed to
straight AIXI, which is an outright *uncomputable* algorithm).  I'm sure you
realize this, but those who haven't read Hutter's stuff may not...

If you haven't already, you should look at Juergen Schmidhuber's OOPS
system, which is similar in spirit to AIXI-tl but less computationally
infeasible.  (Although I don't think that OOPS is a viable pragmatic
approach to AGI either, it's a little closer.)

 AIXI is unFriendly and cannot be made Friendly
 regardless of *any* pattern of reinforcement delivered during childhood.

This assertion doesn't strike me as clearly false...  But I'm not sure why
it's true either.

Please share your argument...

-- Ben




Re: [agi] unFriendly AIXI

2003-02-11 Thread RSbriggs
In a message dated 2/11/2003 10:17:07 AM Mountain Standard Time, [EMAIL PROTECTED] writes:

1) There is a class of physically realizable problems, which humans can 
solve easily for maximum reward, but which - as far as I can tell - AIXI 
cannot solve even in principle;

2) While an AIXI-tl of limited physical and cognitive capabilities might 
serve as a useful tool, AIXI is unFriendly and cannot be made Friendly 
regardless of *any* pattern of reinforcement delivered during childhood.

Before I post further, is there *anyone* who sees this besides me?


Can someone post a link to this?

Thanks!


RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

  2)  While an AIXI-tl of limited physical and cognitive capabilities
  might serve as a useful tool, AIXI is unFriendly and cannot be made
  Friendly regardless of *any* pattern of reinforcement delivered during
  childhood.
 
  Before I post further, is there *anyone* who sees this besides me?

 Also, let me make clear why I'm asking this.  AIXI and AIXI-tl are formal
 definitions; they are *provably* unFriendly.  There is no margin for
 handwaving about future revisions of the system, emergent properties of
 the system, and so on.  A physically realized AIXI or AIXI-tl will,
 provably, appear to be compliant up until the point where it reaches a
 certain level of intelligence, then take actions which wipe out the human
 species as a side effect.  The most critical theoretical problems in
 Friendliness are nonobvious, silent, catastrophic, and not inherently fun
 for humans to argue about; they tend to be structural properties of a
 computational process rather than anything analogous to human moral
 disputes.  If you are working on any AGI project that you believe has the
 potential for real intelligence, you are obliged to develop professional
 competence in spotting these kinds of problems.  AIXI is a formally
 complete definition, with no margin for handwaving about future
 revisions.
   If you can spot catastrophic problems in AI morality you should be able
 to spot the problem in AIXI.  Period.  If you cannot *in advance* see the
 problem as it exists in the formally complete definition of AIXI, then
 there is no reason anyone should believe you if you afterward claim that
 your system won't behave like AIXI due to unspecified future features.

Eliezer,

AIXI and AIXItl are systems that are designed to operate with an initial
fixed goal.  As defined, they don't modify the overall goal they try to
achieve, they just try to achieve this fixed goal as well as possible
through adaptively determining their actions.

Basically, at each time step, AIXI searches through the space of all
programs to find the program that, based on its experience, will best
fulfill its given goal.  It then lets this best program run and determine
its next action.  Based on that next action, it has a new program-space
search problem... etc.

AIXItl does the same thing but it restricts the search to a finite space of
programs, hence it's a computationally possible (but totally impractical)
algorithm.

The harmfulness or benevolence of an AIXI system is therefore closely tied
to the definition of the goal that is given to the system in advance.

It's a very different sort of setup than Novamente, because

1) a Novamente will be allowed to modify its own goals based on its
experience.
2) a Novamente will be capable of spontaneous behavior as well as explicitly
goal-directed behavior

I'm not used to thinking about fixed-goal AGI systems like AIXI,
actually

The Friendliness and other qualities of such a system seem to me to depend
heavily on the goal chosen.

For instance, what if the system's goal were to prove as many complex
mathematical theorems as possible (given a certain axiomatization of math,
and a certain definition of complexity).  Then it would become dangerous in
the long run when it decided to reconfigure all matter in the universe to
increase its brainpower.

So you want "be nice to people and other living things" to be part of its
initial fixed goal.  But this is very hard to formalize in a rigorous
way...  Any formalization one could create is bound to have some holes in
it...  And the system will have no desire to fix the holes, because its
structure is oriented around achieving its given fixed goal...

A fixed-goal AGI system seems like a bit of a bitch, Friendliness-wise...

What if one supplied AIXI with a goal that explicitly involved modifying its
own goal, though?

So, the initial goal G = "Be nice to people and other living things
according to the formalization F, AND iteratively reformulate this goal in
a way that pleases the humans you're in contact with, according to the
formalization F1."

It is not clear to me that an AIXI with this kind of
self-modification-oriented goal would be unfriendly to humans.  It might be,
though.  It's not an approach I would trust particularly.

If one gave the AIXItl system the capability to modify the AIXItl algorithm
itself in such a way as to maximize expected goal achievement given its
historical observations, THEN one has a system that really goes beyond
AIXItl, and has a much less predictable behavior.  Hutter's theorems don't
hold anymore, for one thing (though related theorems might).

Anyway, since AIXI is uncomputable and AIXItl is totally infeasible, this is
a purely academic exercise!

-- Ben G











RE: [agi] unFriendly AIXI

2003-02-11 Thread Bill Hibbard
On Tue, 11 Feb 2003, Ben Goertzel wrote:

 Eliezer wrote:
* a paper by Marcus Hutter giving a Solomonoff induction based theory
of general intelligence
 
  Interesting you should mention that.  I recently read through Marcus
  Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a
  formal definition of intelligence, it is not a solution of Friendliness
  (nor do I have any reason to believe Marcus Hutter intended it as one).
 
  In fact, as one who specializes in AI morality, I was immediately struck
  by two obvious-seeming conclusions on reading Marcus Hutter's formal
  definition of intelligence:
 
  1)  There is a class of physically realizable problems, which humans can
  solve easily for maximum reward, but which - as far as I can tell - AIXI
  cannot solve even in principle;

 I don't see this, nor do I believe it...

I don't believe it either. Is this a reference to Penrose's
argument based on Goedel's Incompleteness Theorem (which is
wrong)?

  2)  While an AIXI-tl of limited physical and cognitive capabilities might
  serve as a useful tool,

 AIXI-tl is a totally computationally infeasible algorithm.  (As opposed to
 straight AIXI, which is an outright *uncomputable* algorithm).  I'm sure you
 realize this, but those who haven't read Hutter's stuff may not...

 If you haven't already, you should look at Juergen Schmidhuber's OOPS
 system, which is similar in spirit to AIXI-tl but less computationally
 infeasible.  (Although I don't think that OOPS is a viable pragmatic
 approach to AGI either, it's a little closer.)

  AIXI is unFriendly and cannot be made Friendly
  regardless of *any* pattern of reinforcement delivered during childhood.

 This assertion doesn't strike me as clearly false  But I'm not sure why
 it's true either.

The formality of Hutter's definitions can give the impression
that they cannot evolve. But they are open to interactions
with the external environment, and can be influenced by it
(including evolving in response to it). If the reinforcement
values are for human happiness, then the formal system and
humans together form a symbiotic system. This symbiotic
system is where you have to look for the friendliness. This
is part of an earlier discussion at:

  http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html

Cheers,
Bill




RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

For the third grade, my oldest son Zar went to a progressive charter school
where they did one silly thing: each morning in homeroom the kids had to
write on a piece of paper what their goal for the day was.  Then at the end
of the day they had to write down how well they did at achieving their goal.

Being a Goertzel, Zar started out with "My goal is to meet my goal" and
after a few days started using "My goal is not to meet my goal."

Soon many of the boys in his class were using "My goal is not to meet my
goal."

Self-referential goals were banned in the school... but soon, the silly
goal-setting exercise was abolished (saving the kids a bit of time-wasting
each day).

What happens when AIXI is given the goal "My goal is not to meet my goal"?
;-)

I suppose its behavior becomes essentially random?

If one started a Novamente system off with the prime goal "My goal is not
to meet my goal", it would probably end up de-emphasizing and eventually
killing this goal.  Its long-term dynamics would not be random, because some
other goal (or set of goals) would arise in the system and become dominant.
But it's hard to say in advance what those would be.

-- Ben G



 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
 Behalf Of Ben Goertzel
 Sent: Tuesday, February 11, 2003 4:33 PM
 To: [EMAIL PROTECTED]
 Subject: RE: [agi] unFriendly AIXI



  The formality of Hutter's definitions can give the impression
  that they cannot evolve. But they are open to interactions
  with the external environment, and can be influenced by it
  (including evolving in response to it). If the reinforcement
  values are for human happiness, then the formal system and
  humans together form a symbiotic system. This symbiotic
  system is where you have to look for the friendliness. This
  is part of an earlier discussion at:
 
http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html
 
  Cheers,
  Bill

 Bill,

 What you say is mostly true.

 However, taken literally Hutter's AGI designs involve a fixed,
 precisely-defined goal function.

 This strikes me as an unsafe architecture in the sense that we
 may not get
 the goal exactly right the first time around.

 Now, if humans iteratively tweak the goal function, then indeed, we have a
 synergetic system, whose dynamics include the dynamics of the
 goal-tweaking
 humans...

 But what happens if the system interprets its rigid goal to imply that it
 should stop humans from tweaking its goal?

 Of course, the goal function should be written in such a way as to make it
 unlikely the system will draw such an implication...

 It's also true that tweaking a superhumanly intelligent system's goal
 function may be very difficult for us humans with our limited
 intelligence.

 Making the goal function adaptable makes AIXItl into something a bit
 different... and making the AIXItl code rewritable by AIXItl makes it into
 something even more different...

 -- Ben G






Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:


AIXI and AIXItl are systems that are designed to operate with an initial
fixed goal.  As defined, they don't modify the overall goal they try to
achieve, they just try to achieve this fixed goal as well as possible
through adaptively determining their actions.

Basically, at each time step, AIXI searches through the space of all
programs to find the program that, based on its experience, will best
fulfill its given goal.  It then lets this best program run and determine
its next action.  Based on that next action, it has a new program space
search program... etc.

AIXItl does the same thing but it restricts the search to a finite space of
programs, hence it's a computationally possible (but totally impractical)
algorithm.

The harmfulness or benevolence of an AIXI system is therefore closely tied
to the definition of the goal that is given to the system in advance.


Actually, Ben, AIXI and AIXI-tl are both formal systems; there is no 
internal component in that formal system corresponding to a goal 
definition, only an algorithm that humans use to determine when and how 
hard they will press the reward button.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Bill Hibbard
Ben,

On Tue, 11 Feb 2003, Ben Goertzel wrote:

  The formality of Hutter's definitions can give the impression
  that they cannot evolve. But they are open to interactions
  with the external environment, and can be influenced by it
  (including evolving in response to it). If the reinforcement
  values are for human happiness, then the formal system and
  humans together form a symbiotic system. This symbiotic
  system is where you have to look for the friendliness. This
  is part of an earlier discussion at:
 
http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html
 
  Cheers,
  Bill

 Bill,

 What you say is mostly true.

 However, taken literally Hutter's AGI designs involve a fixed,
 precisely-defined goal function.

 This strikes me as an unsafe architecture in the sense that we may not get
 the goal exactly right the first time around.

 Now, if humans iteratively tweak the goal function, then indeed, we have a
 synergetic system, whose dynamics include the dynamics of the goal-tweaking
 humans...

 But what happens if the system interprets its rigid goal to imply that it
 should stop humans from tweaking its goal?
 . . .

The key thing is that Hutter's system is open - it reads
data from the external world. And there is no essential
difference between data and code (all data needs is an
interpreter to become code). So evolving values (goals)
can come from the external world.

We can draw a system boundary around any combination of
the formal system and the external world. By defining
reinforcement values for human happiness, system values
are equated to human values and the friendly system is
the symbiosis of the formal system and humans. The formal
values are fixed, but they are tied to human values, which
are not fixed and can evolve.

Cheers,
Bill




RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel
  The harmfulness or benevolence of an AIXI system is therefore closely tied
  to the definition of the goal that is given to the system in advance.

 Actually, Ben, AIXI and AIXI-tl are both formal systems; there is no
 internal component in that formal system corresponding to a goal
 definition, only an algorithm that humans use to determine when and how
 hard they will press the reward button.

 --
 Eliezer S. Yudkowsky

Well, the definitions of AIXI and AIXItl assume the existence of a reward
function or goal function (denoted V in the paper).

The assumption of the math is that this reward function is specified
up-front, before AIXI/AIXItl starts running.
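
(In symbols -- my paraphrase of the paper's Section 4.1, not a quotation -- the
credit assigned to a policy p over cycles k through m is roughly its expected
cumulative reward,

  V^{p}_{km} = \mathbf{E}\left[\, r_k + r_{k+1} + \cdots + r_m \,\right],

and Hutter's intelligence ordering compares policies by this quantity.)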

If the reward function is allowed to change adaptively, based on the
behavior of the AIXI/AIXItl algorithm, then the theorems don't work anymore,
and you have a different sort of synergetic system such as Bill Hibbard
was describing.

If human feedback IS the reward function, then you have a case where the
reward function may well change adaptively based on the AI system's
behavior.

Whether the system will ever achieve any intelligence at all then depends on
how clever the humans are in doing the rewarding... as I said, Hutter's
theorems about intelligence don't apply...

-- Ben






Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:


The harmfulness or benevolence of an AIXI system is therefore closely tied
to the definition of the goal that is given to the system in advance.


Under AIXI the goal is not given to the system in advance; rather, the 
system learns the humans' goal pattern through Solomonoff induction on the 
reward inputs.  Technically, in fact, it would be entirely feasible to 
give AIXI *only* reward inputs, although in this case it might require a 
long time for AIXI to accumulate enough data to constrain the 
Solomonoff-induced representation to a sufficiently detailed model of 
reality that it could successfully initiate complex actions.  The utility 
of the non-reward input is that it provides additional data, causally 
related to the mechanisms producing the reward input, upon which 
Solomonoff induction can also be performed.  Agreed?
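
(For readers without the paper in front of them, the mixture being described
is, roughly and in my own paraphrase of Hutter's notation,

  \xi(x_{1:k} r_{1:k} \mid a_{1:k}) = \sum_{q \,:\, U(q,\, a_{1:k}) = x_{1:k} r_{1:k}} 2^{-\ell(q)} ,

a 2^-length weighted sum over all programs q that, run on the universal
machine U with the action history, reproduce the observed input-and-reward
history; actions are then chosen to maximize the \xi-expected sum of future
rewards.  The point being made is that the x and r channels are predicted
jointly by this one mixture.)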

It's a very different sort of setup than Novamente, because

1) a Novamente will be allowed to modify its own goals based on its
experience.


Depending on the pattern of inputs and rewards, AIXI will modify its 
internal representation of the algorithm which it expects to determine 
future rewards.  Would you say that this is roughly analogous to 
Novamente's learning of goals based on experience, or is there in your 
view a fundamental difference?  And if so, is AIXI formally superior or in 
some way inferior to Novamente?

2) a Novamente will be capable of spontaneous behavior as well as explicitly
goal-directed behavior


If the purpose of spontaneous behavior is to provoke learning experiences, 
this behavior is implicit in AIXI as well, though not obviously so.  I'm 
actually not sure about this because Hutter doesn't explicitly discuss it. 
 But it looks to me like AIXI, under its formal definition, emergently 
exhibits curiosity wherever there are, for example, two equiprobable 
models of reality which determine different rewards and can be 
distinguished by some test.  What we interpret as spontaneous behavior 
would then emerge from a horrendously uncomputable exploration of all 
possible realities to find tests which are ultimately likely to result in 
distinguishing data, but in ways which are not at all obvious to any human 
observer.  Would it be fair to say that AIXI's spontaneous behavior is 
formally superior to Novamente's spontaneous behavior?

I'm not used to thinking about fixed-goal AGI systems like AIXI,
actually

The Friendliness and other qualities of such a system seem to me to depend
heavily on the goal chosen.


Again, AIXI as a formal system has no goal definition.  [Note:  I may be 
wrong about this; Ben Goertzel and I seem to have acquired different 
models of AIXI and it is very possible that mine is the wrong one.]  It is 
tempting to think of AIXI as Solomonoff-inducing a goal pattern from its 
rewards, and Solomoff-inducing reality from its main input channel, but 
actually AIXI simultaneously induces the combined reality-and-reward 
pattern from both the reward channel and the input channel simultaneously. 
 In theory AIXI could operate on the reward channel alone; it just might 
take a long time before the reward channel gave enough data to constrain 
its reality-and-reward model to the point where AIXI could effectively 
model reality and hence generate complex reward-maximizing actions.

For instance, what if the system's goal were to prove as many complex
mathematical theorems as possible (given a certain axiomatizaton of math,
and a certain definition of complexity).  Then it would become dangerous in
the long run when it decided to reconfigure all matter in the universe to
increase its brainpower.

So you want be nice to people and other living things to be part of its
initial fixed goal.  But this is very hard to formalize in a rigorous
way  Any formalization one could create, is bound to have some holes in
it  And the system will have no desire to fix the holes, because its
structure is oriented around achieving its given fixed goal

A fixed-goal AGI system seems like a bit of a bitch, Friendliness-wise...


If the humans see that AIXI seems to be dangerously inclined toward just 
proving math theorems, they might decide to press the reward button when 
AIXI provides cures for cancer, or otherwise helps people.  AIXI would 
then modify its combined reality-and-reward representation accordingly to 
embrace the new simplest explanation that accounted for *all* the data, 
i.e., its reward function would then have to account for mathematical 
theorems *and* cancer cures *and* any other kind of help that humans had, 
in the past, pressed the reward button for.

Would you say this is roughly analogous to the kind of learning you intend 
Novamente to perform?  Or perhaps even an ideal form of such learning?

What if one supplied AIXI with a goal that explicitly involved modifying its
own goal, though?


Self-modification in any form completely breaks Hutter's definition, and 
you no longer have an AIXI anymore.

Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:



Huh.  We may not be on the same page.  Using:
http://www.idsia.ch/~marcus/ai/aixigentle.pdf

Page 5:

The general framework for AI might be viewed as the design and study of
intelligent agents [RN95]. An agent is a cybernetic system with some
internal state, which acts with output y_k on some environment in cycle k,
perceives some input x_k from the environment and updates its internal
state. Then the next cycle follows. We split the input x_k into a regular
part x'_k and a reward r_k, often called reinforcement feedback. From time
to time the environment provides non-zero reward to the agent. The task of
the agent is to maximize its utility, defined as the sum of future rewards.

I didn't see any reward function V defined for AIXI in any of the Hutter
papers I read, nor is it at all clear how such a V could be defined, given
that the internal representation of reality produced by Solomonoff
induction is not fixed enough for any reward function to operate on it in
the same way that, e.g., our emotions bind to our own standardized
cognitive representations.


Quite literally, we are not on the same page ;)


Thought so...


Look at page 23, Definition 10 of the intelligence ordering relation
(which says what it means for one system to be more intelligent than
another).  And look at the start of Section 4.1, which Definition 10 lives
within.

The reward function V is defined there, basically as cumulative reward over
a period of time.  It's used all thru Section 4.1, and following that, it's
used mostly implicitly inside the intelligence ordering relation.


The reward function V however is *not* part of AIXI's structure; it is 
rather a test *applied to* AIXI from outside as part of Hutter's 
optimality proof.  AIXI itself is not given V; it induces V via Solomonoff 
induction on past rewards.  V can be at least as flexible as any criterion 
a (computable) human uses to determine when and how hard to press the 
reward button, nor is AIXI's approximation of V fixed at the start.  Given 
this, would you regard AIXI as formally approximating the kind of goal 
learning that Novamente is supposed to do?

As Definition 10 makes clear, intelligence is defined relative to a fixed
reward function.


A fixed reward function *outside* AIXI, so that the intelligence of AIXI 
can be defined relative to it... or am I wrong?

 What the theorems about AIXItl state is that, given a
fixed reward function, the AIXItl can do as well as any other algorithm at
achieving this reward function, if you give it computational resources equal
to those that the other algorithm got, plus a constant.  But the constant is
fucking HUGE.


Actually, I think AIXItl is supposed to do as well as a tl-bounded 
algorithm given t·2^l resources... though again perhaps I am wrong.
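
(For scale, taking that bound at face value and doing my own back-of-the-
envelope arithmetic: with program length l = 1000 bits, 2^l = 2^{1000} is
about 10^{301}, so whether the overhead is multiplicative or additive changes
nothing about how huge it is.)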

Whether you specify the fixed reward function in its cumulative version or
not doesn't really matter...


Actually, AIXI's fixed horizon looks to me like it could give rise to some 
strange behaviors, but I think Hutter's already aware that this is 
probably AIXI's weakest link.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel


  Given
 this, would you regard AIXI as formally approximating the kind of goal
 learning that Novamente is supposed to do?

Sorta.. but goal-learning is not the complete motivational structure of
Novamente... just one aspect

  As Definition 10 makes clear, intelligence is defined relative
 to a fixed
  reward function.

 A fixed reward function *outside* AIXI, so that the intelligence of AIXI
 can be defined relative to it... or am I wrong?

No, you're right.


   What the theorems about AIXItl state is that, given a
  fixed reward function, the AIXItl can do as well as any other
 algorithm at
  achieving this reward function, if you give it computational
 resources equal
  to those that the other algorithm got, plus a constant.  But
 the constant is
  fucking HUGE.

 Actually, I think AIXItl is supposed to do as well as a tl-bounded
  algorithm given t·2^l resources... though again perhaps I am wrong.

Ah, so the constant is multiplicative rather than additive.  You're probably
right.. I haven't looked at those details for a while (I read the paper
moderately carefully several months ago, and just glanced at it briefly now
in the context of this discussion).  But that doesn't make the algorithm any
better ;-)

Now that I stretch my aged memory, I recall that Hutter's other papers give
variations on the result, e.g.

http://www.hutter1.de/ai/pfastprg.htm

gives a multiplicative factor of 5 and some additive term.  I think the
result in that paper could be put together with AIXItl though he hasn't done
so yet.

  Whether you specify the fixed reward function in its cumulative
 version or
  not doesn't really matter...

 Actually, AIXI's fixed horizon looks to me like it could give
 rise to some
 strange behaviors, but I think Hutter's already aware that this is
 probably AIXI's weakest link.

yeah, that assumption was clearly introduced to make the theorems easier to
prove.  I don't think it's essential to the theory, really.

ben




Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:


Yeah, you're right, I mis-spoke.  The theorems assume the goal function is
known in advance -- but not known to the system, just known to the entity
defining and estimating the system's intelligence and giving the rewards.

I was implicitly assuming the case in which the goal was encapsulated in a
goal-definition program of some sort, which was hooked up to AIXI in
advance; but that is not the only case.


Actually, there's no obvious way you could ever include V in AIXI, at all. 
 V would have to operate as a predicate on internal representations of 
reality that have no fixed format or pattern.  At most you might be able 
to define a V that operates as a predicate on AIXI's inputs, in which case 
you can dispense with the separate reward channel.  In fact this is 
formally equivalent to AIXI, since it equates to an AIXI with an input 
channel I and a reward channel that is deterministically V(I).
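
(A minimal sketch of that equivalence, with made-up names -- observe and V
below are hypothetical stand-ins, not anything from Hutter's formalism:)

from typing import Callable, Tuple

def wrap_with_reward(observe: Callable[[], str],
                     V: Callable[[str], float]) -> Callable[[], Tuple[str, float]]:
    """Turn a plain input channel into an (input, reward) channel whose reward
    is computed deterministically from the input by the predicate V."""
    def percept() -> Tuple[str, float]:
        x = observe()
        return x, V(x)          # the reward channel is just V applied to the input
    return percept

# Toy usage (made-up observation and predicate):
percept = wrap_with_reward(lambda: "patient cured",
                           lambda x: 1.0 if "cured" in x else 0.0)
print(percept())   # ('patient cured', 1.0)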

It's a very different sort of setup than Novamente, because

1) a Novamente will be allowed to modify its own goals based on its
experience.


Depending on the pattern of inputs and rewards, AIXI will modify its
internal representation of the algorithm which it expects to determine
future rewards.  Would you say that this is roughly analogous to
Novamente's learning of goals based on experience, or is there in your
view a fundamental difference?  And if so, is AIXI formally
superior or in some way inferior to Novamente?


Well, AIXI is superior to any computable algorithm, in a sense.  If you had
the infinite-computing-power hardware that it requires, it would be pretty
damn powerful ;-p  But so would a lot of other approaches!!  Infinite
computing power provides AIs with a lot of axle grease!!


Obviously it is not AIXI's purpose to be implemented.  What AIXI defines 
rather is an abstraction that lets us talk more easily about certain kinds 
of intelligence.  If any AI program we could conceivably want to build is 
an imperfect approximation of AIXI, that is an interesting property of 
AIXI.  If an AI program we want to build is *superior* to AIXI then that 
is an *extremely* interesting property.

The reason I asked the question was not to ask whether AIXI is 
pragmatically better as a design strategy than Novamente.  What I was 
asking you rather is if, looking at AIXI, you see something *missing* that 
would be present in Novamente.  In other words, *if* you had an infinitely 
powerful computer processor, is there a reason why you would *not* 
implement AIXI on it, and would instead prefer Novamente, even if it had 
to run on a plain old cluster?

If the purpose of spontaneous behavior is to provoke learning
experiences,
this behavior is implicit in AIXI as well, though not obviously so.  I'm
actually not sure about this because Hutter doesn't explicitly
discuss it.


Well, you could argue that if Novamente is so good, AIXI will eventually
figure out how to emulate Novamente, since Novamente is just one of the many
programs in the space it searches!!

I am really not very interested in comparing AIXI to Novamente, because they
are not comparable: AIXI assumes infinite computing power and Novamente does
not.


We aren't comparing AIXI's design to Novamente's design so much as we're 
comparing AIXI's *kind of intelligence* to Novamente's *kind of 
intelligence*.  Does Novamente have something AIXI is missing?  Or does 
AIXI have strictly more intelligence than Novamente?

Actually, given the context of Friendliness, what we're interested in is 
not so much intelligence as interaction with humans; under this view, 
for example, giving humans a superintelligently deduced cancer cure is 
just one way of interacting with humans.  Looking at AIXI and Novamente, 
do you see any way that Novamente interacts with humans in a way that AIXI 
cannot?

AIXItl, on the other hand, is a finite-computing-power program.  In
principle it can demonstrate spontaneous behaviors, but in practice, I think
it will not demonstrate many interesting spontaneous behaviors.  Because it
will spend all its time dumbly searching through a huge space of useless
programs!!

Also, not all of Novamente's spontaneous behaviors are even implicitly
goal-directed.  Novamente is a goal-oriented but not 100% goal-directed
system, which is one major difference from AIXI and AIXItl.


I agree that it is a major difference; does it mean that Novamente can 
interact with humans in useful or morally relevant ways of which AIXI is 
incapable?

 But it looks to me like AIXI, under its formal definition, emergently
exhibits curiosity wherever there are, for example, two equiprobable
models of reality which determine different rewards and can be
distinguished by some test.  What we interpret as spontaneous behavior
would then emerge from a horrendously uncomputable exploration of all
possible realities to find tests which are ultimately likely to result in
distinguishing data, but in ways which are not at all obvious to
any human

Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Eliezer S. Yudkowsky wrote:


Not really.  There is certainly a significant similarity between Hutter's
stuff and the foundations of Novamente, but there are significant
differences too.  To sort out the exact relationship would take me 
more than a few minutes' thought.

There are indeed major differences in the foundations.  Is there 
something useful or important that Novamente does, given its 
foundations, that you could not do if you had a physically realized 
infinitely powerful computer running Hutter's stuff?

Actually, you said that it would take you more than a few minutes' thought 
to sort it all out, so let me ask a question which you can hopefully 
answer more quickly...

Do you *feel intuitively* that there is something useful or important 
Novamente does, given its foundations, that you could not do if you had a 
physically realized AIXI?

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] unFriendly AIXI

2003-02-11 Thread Philip Sutton
Eliezer,

In this discussion you have just moved the focus to the superiority of 
one AGI approach versus another in terms of *interacting with 
humans*.

But once one AGI exists, it's most likely not long before there are more
AGIs, and there will need to be a moral/ethical system to guide AGI-AGI
interaction.  And with super-clever AGIs around, it's likely that human
modification speeds up, leading the category 'human' to become a very
loose term.  So we need a moral/ethical system to guide
AGI-once-were-human interactions.

So for these two reasons alone, I think we need to start out thinking in
more general terms than AGIs being focussed on 'interacting with humans'.

If you have a goal-modifying AGI it might figure this all out.  But why
shouldn't the human designers/teachers avoid the problem in the first
place, since we can anticipate the issue already fairly easily?

Of course, in terms of the 'unFriendly AIXI' debate, this issue of a tight
focus on interaction with humans is of no significance, but I think it is
important in its own right.

Cheers, Philip




RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

Hi,

 The reason I asked the question was not to ask whether AIXI is
 pragmatically better as a design strategy than Novamente.  What I was
 asking you rather is if, looking at AIXI, you see something
 *missing* that
 would be present in Novamente.  In other words, *if* you had an
 infinitely
 powerful computer processor, is there a reason why you would *not*
 implement AIXI on it, and would instead prefer Novamente, even if it had
 to run on a plain old cluster?

These are deep and worthwhile questions that I can't answer thoroughly off
the cuff; I'll have to put some thought into them and reply a little later.

There are other less fascinating but more urgent things in the queue
tonight, alas ;-p

My intuitive feeling is that I'd rather implement Novamente but with AIXI
plugged in as the schema/predicate learning component.  In other words,
it's clear that an infinitely capable procedure learning routine would be
very valuable for AGI.  But I don't really like AIXI's overall control
structure, and I need to think a bit about why.  ONE reason is that it's
insanely inefficient, but even if you remove consideration of efficiency,
there may be other problems with it too.
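
Very roughly, the architectural picture I have in mind looks something like
the sketch below.  The names are hypothetical, not actual Novamente (or
AIXI) interfaces; the point is only that the idealized learner is one
pluggable component, and the surrounding control structure decides when and
on what to invoke it.

# Hypothetical architectural sketch: an idealized procedure learner as ONE
# pluggable component inside a system with its own control structure.  None
# of these names are real Novamente (or AIXI) interfaces.
from typing import Callable, Protocol, Sequence, Tuple

class ProcedureLearner(Protocol):
    def learn(self, examples: Sequence[Tuple], budget: int) -> Callable:
        ...

class OracleLearner:
    # Stand-in for an "infinitely capable" procedure-learning routine.
    def learn(self, examples, budget):
        lookup = dict(examples)
        return lambda x: lookup.get(x)      # pretend this generalizes perfectly

class SurroundingSystem:
    # The control structure lives out here: it decides when and on what to
    # invoke the learner; the learner is just a subcomponent.
    def __init__(self, learner: ProcedureLearner):
        self.learner = learner
        self.goals = ["satisfy_user", "stay_curious"]   # made-up goal labels

    def handle_task(self, examples):
        if self.goals:                                  # (toy) control decision
            return self.learner.learn(examples, budget=10_000)

system = SurroundingSystem(OracleLearner())
double = system.handle_task([(1, 2), (2, 4), (3, 6)])
print(double(2))   # -> 4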


 Actually, given the context of Friendliness, what we're interested in is
 not so much intelligence as interaction with humans; under this view,
 for example, giving humans a superintelligently deduced cancer cure is
 just one way of interacting with humans.  Looking at AIXI and Novamente,
 do you see any way that Novamente interacts with humans in a way
 that AIXI  cannot?

Well, off the cuff, I'm not sure because I've thought about Novamente a lot
more than I've thought about AIXI.

I'll need to mull this over ... It's certainly worth thinking about.

Novamente is fundamentally self-modifying (NOT the current codebase but the
long-term design).  Based on feedback from humans and its own
self-organization, it can completely revise its own codebase.  AIXI can't
do that.

Along with self-modification comes the ability to modify its
reward/punishment receptors, and interpret what formerly would have been a
reward as a punishment...
[This won't happen often but is in principle a possibility]

I don't know if this behavior is in AIXI's repertoire... is it?
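
To illustrate the distinction I'm drawing, here is a toy contrast -- my
illustration only, not a claim about either system's actual code: an agent
whose interpretation of the raw reward signal is itself modifiable state,
versus one that takes the reward channel at face value by definition.

# Toy contrast (illustrative only): an agent whose *interpretation* of the
# raw reward signal is modifiable state, versus one that takes the reward
# channel at face value by definition.
class FixedRewardAgent:
    def credit(self, raw_signal: float) -> float:
        return raw_signal                     # reward channel taken as given

class SelfRevisableAgent:
    def __init__(self):
        self.interpret = lambda raw: raw      # initial interpretation

    def revise_interpretation(self, new_interpret):
        self.interpret = new_interpret        # a self-modification step

    def credit(self, raw_signal: float) -> float:
        return self.interpret(raw_signal)

print(FixedRewardAgent().credit(+1.0))        # always +1.0
agent = SelfRevisableAgent()
print(agent.credit(+1.0))                     # +1.0: reward read as reward
agent.revise_interpretation(lambda raw: -raw)
print(agent.credit(+1.0))                     # -1.0: same signal read as punishment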

  Also, not all of Novamente's spontaneous behaviors are even implicitly
  goal-directed.  Novamente is a goal-oriented but not 100% goal-directed
  system, which is one major difference from AIXI and AIXItl.

 I agree that it is a major difference; does it mean that Novamente can
 interact with humans in useful or morally relevant ways of which AIXI is
 incapable?

Maybe... hmmm.

  In that case you cannot prove any of Hutter's theorems about them.  And
  if you can't prove theorems about them then they are nothing more than
  useless abstractions.  Since AIXI can never be implemented and AIXItl is
  so inefficient it could never do anything useful in practice.

 But they are very useful tools for talking about fundamental kinds of
 intelligence.

I am not sure whether they are or not.

  Well, sure ... it's *roughly analogous*, in the sense that it's
  experiential reinforcement learning, sure.

 Is it roughly analogous, but not really analogous, in the sense that
 Novamente can do something AIXI can't?

Well, Novamente will not follow the expectimax algorithm.  So it will
display behaviors that AIXI will never display.
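
(By "expectimax" here I mean the alternation of a max over actions with an
expectation over outcomes under a mixture of environment models.  Below is
a minimal toy sketch, with made-up models and with belief updating after
each percept omitted for brevity; it's only meant to show the shape of the
computation, not Hutter's actual equation.)

# Minimal finite-horizon expectimax sketch (toy illustration, not Hutter's
# actual AIXI equation): alternate a max over actions with an expectation
# over outcomes under a fixed mixture of environment models.
ACTIONS = ("left", "right")

# Two made-up environment models: action -> list of (probability, reward).
MODELS = {
    "env_A": {"left": [(1.0, 1.0)], "right": [(1.0, 0.0)]},
    "env_B": {"left": [(0.5, 0.0), (0.5, 1.0)], "right": [(1.0, 1.0)]},
}
MIXTURE = {"env_A": 0.5, "env_B": 0.5}        # prior weights over the models

def expectimax(horizon):
    # Expected total reward of acting optimally for 'horizon' more steps.
    if horizon == 0:
        return 0.0
    best = float("-inf")
    for action in ACTIONS:                               # max over actions
        expected = 0.0
        for model, weight in MIXTURE.items():            # expectation over models
            for prob, reward in MODELS[model][action]:   # ...and over outcomes
                expected += weight * prob * (reward + expectimax(horizon - 1))
        best = max(best, expected)
    return best

print(expectimax(horizon=3))   # expected reward of the best 3-step plan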

I'm having trouble, off the cuff and in a hurry, thinking about AIXI in the
context of a human saying to it "In my view, you should adjust your goal
system for this reason."

If a human says this to Novamente, it may consider the request and may do
so.  It may do so if this human has been right about a lot of things in the
past, for example.

If a human says this to AIXI, how does AIXI react and why?  AIXI doesn't
have a goal system in the same sense that Novamente does.  AIXI, if it's
smart enough, could hypothetically figure out what the human meant and use
this to modify its current operating program (but not its basic
program-search mechanism, because AIXI is not self-modifying in such a
strong sense)... if its history told it that listening to humans causes it
to get rewarded.  But it seems to me intuitively that the modification AIXI
would make in this case would not constrain or direct AIXI's future
development as strongly as the modification Novamente would make in
response to the same human request.  I'm not 100% sure about this though,
because my mental model of AIXI's dynamics is not that good, and I haven't
tried to do the math corresponding to this scenario.

What do you think about AIXI's response to this scenario, Eliezer?

You seem to have your head more fully wrapped around AIXI than I do, at the
moment ;-)

I really should reread the paper, but I don't have time right now.

This little scenario I've just raised does NOT exhaust the potentially
important differences between Novamente and AIXI, it's just one thing that
happened to occur to me.

Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Bill Hibbard wrote:

On Tue, 11 Feb 2003, Ben Goertzel wrote:


Eliezer wrote:


Interesting you should mention that.  I recently read through Marcus
Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a
formal definition of intelligence, it is not a solution of Friendliness
(nor do I have any reason to believe Marcus Hutter intended it as one).

In fact, as one who specializes in AI morality, I was immediately struck
by two obvious-seeming conclusions on reading Marcus Hutter's formal
definition of intelligence:

1)  There is a class of physically realizable problems, which humans can
solve easily for maximum reward, but which - as far as I can tell - AIXI
cannot solve even in principle;


I don't see this, nor do I believe it...


I don't believe it either. Is this a reference to Penrose's
argument based on Goedel's Incompleteness Theorem (which is
wrong)?


Oh, well, in that case, I'll make my statement more formal:

There exists a physically realizable, humanly understandable challenge C 
on which a tl-bounded human outperforms AIXI-tl for humanly understandable 
reasons.  Or even more formally, there exists a computable process P 
which, given either a tl-bounded uploaded human or an AIXI-tl, supplies 
the uploaded human with a greater reward as the result of strategically 
superior actions taken by the uploaded human.
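
One compact way to write that down, with R(P(A)) standing for the total
reward the process P supplies to the agent A it is given:

    $\exists\ \text{computable } P \;:\; R\big(P(\text{human}_{tl})\big) > R\big(P(\text{AIXI-}tl)\big)$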

:)

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

 Oh, well, in that case, I'll make my statement more formal:

 There exists a physically realizable, humanly understandable challenge C
 on which a tl-bounded human outperforms AIXI-tl for humanly
 understandable
 reasons.  Or even more formally, there exists a computable process P
 which, given either a tl-bounded uploaded human or an AIXI-tl, supplies
 the uploaded human with a greater reward as the result of strategically
 superior actions taken by the uploaded human.

 :)

 --
 Eliezer S. Yudkowsky

Hmmm.

Are you saying that given a specific reward function and a specific
environment, the tl-bounded uploaded human with resources (t,l) will act so
as to maximize the reward function better than AIXI-tl with resources (T,l),
with T as specified by Hutter's theorem of AIXI-tl optimality?

Presumably you're not saying that, because it would contradict his theorem?
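
(For reference, and from memory so only roughly: Hutter's optimality result
compares AIXItl against agents of program length at most l and per-cycle
computation time at most t, while AIXItl itself is allowed per-cycle time on
the order of

    $T \approx t \cdot 2^{l}$,

i.e. it is granted vastly more computation than the agents it provably
matches or beats.)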

So what clever loophole are you invoking?? ;-)

ben






Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
 Oh, well, in that case, I'll make my statement more formal:

 There exists a physically realizable, humanly understandable
 challenge C on which a tl-bounded human outperforms AIXI-tl for
 humanly understandable reasons.  Or even more formally, there exists
 a computable process P which, given either a tl-bounded uploaded
 human or an AIXI-tl, supplies the uploaded human with a greater
 reward as the result of strategically superior actions taken by the
 uploaded human.

 :)

 -- Eliezer S. Yudkowsky

 Hmmm.

 Are you saying that given a specific reward function and a specific
 environment, the tl-bounded uploaded human with resources (t,l) will
 act so as to maximize the reward function better than AIXI-tl with
 resources (T,l) with T as specified by Hutter's theorem of AIXI-tl
 optimality?

 Presumably you're not saying that, because it would contradict his
 theorem?

Indeed.  I would never presume to contradict Hutter's theorem.

 So what clever loophole are you invoking?? ;-)

An intuitively fair, physically realizable challenge with important 
real-world analogues, solvable by the use of rational cognitive reasoning 
inaccessible to AIXI-tl, with success strictly defined by reward (not a 
Friendliness-related issue).  It wouldn't be interesting otherwise.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel

   So what clever loophole are you invoking?? ;-)

 An intuitively fair, physically realizable challenge with important
 real-world analogues, solvable by the use of rational cognitive reasoning
 inaccessible to AIXI-tl, with success strictly defined by reward (not a
 Friendliness-related issue).  It wouldn't be interesting otherwise.

 --
 Eliezer S. Yudkowsky

Well, when you're ready to spill, we're ready to listen ;)

I am guessing it utilizes the reward function in an interesting sort of
way...

ben




RE: [agi] unFriendly AIXI

2003-02-11 Thread Ben Goertzel


 It seems to me that this answer *assumes* that Hutter's work is completely
 right, an assumption in conflict with the uneasiness you express in your
 previous email.

It's right as mathematics...

I don't think his definition of intelligence is the maximally useful one,
though I think it's a reasonably OK one.

I have proposed a different but related definition of intelligence, before,
and have not been entirely satisfied with my own definition, either.  I like
mine better than Hutter's... but I have not proved any cool theorems about
mine...

 If Novamente can do something AIXI cannot, then Hutter's
 work is very highly valuable because it provides a benchmark against which
 this becomes clear.

 If you intuitively feel that Novamente has something AIXI doesn't, then
 Hutter's work is very highly valuable whether your feeling proves correct
 or not, because it's by comparing Novamente against AIXI that you'll learn
 what this valuable thing really *is*.  This holds true whether the answer
 turns out to be "It's capability X that I didn't previously really know
 how to build, and hence didn't see as obviously lacking in AIXI" or "It's
 capability X that I didn't previously really know how to build, and
 hence didn't see as obviously emerging from AIXI."

 So do you still feel that Hutter's work tells you nothing of any use?

Well, it hasn't so far.

It may in the future.  If it does I'll say so ;-)

The thing is, I (like many others) thought of algorithms equivalent to AIXI
years ago, and dismissed them as useless.  What I didn't do is prove
anything about these algorithms, I just thought of them and ignored them ...
Partly because I didn't see how to prove the theorems, and partly because I
thought even once I proved the theorems, I wouldn't have anything
pragmatically useful...

-- Ben




Re: [agi] unFriendly AIXI

2003-02-11 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

 It's right as mathematics...

 I don't think his definition of intelligence is the  maximally useful
 one, though I think it's a reasonably OK one.

 I have proposed a different but related definition of intelligence,
 before, and have not been entirely satisfied with my own definition,
 either.  I like mine better than Hutter's... but I have not proved any
 cool theorems about mine...

Can Hutter's AIXI satisfy your definition?

 If Novamente can do something AIXI cannot, then Hutter's work is very
 highly valuable because it provides a benchmark against which this
 becomes clear.

 If you intuitively feel that Novamente has something AIXI doesn't,
 then Hutter's work is very highly valuable whether your feeling
 proves correct or not, because it's by comparing Novamente against
 AIXI that you'll learn what this valuable thing really *is*.  This
  holds true whether the answer turns out to be "It's capability X that
  I didn't previously really know how to build, and hence didn't see as
  obviously lacking in AIXI" or "It's capability X that I didn't
  previously really know how to build, and hence didn't see as
  obviously emerging from AIXI."

 So do you still feel that Hutter's work tells you nothing of any use?

 Well, it hasn't so far.

 It may in the future.  If it does I'll say so ;-)

 The thing is, I (like many others) thought of algorithms equivalent to
 AIXI years ago, and dismissed them as useless.  What I didn't do is
 prove anything about these algorithms, I just thought of them and
 ignored them  Partly because I didn't see how to prove the
 theorems, and partly because I thought even once I proved the theorems,
 I wouldn't have anything pragmatically useful...

It's not *about* the theorems.  It's about whether the assumptions
**underlying** the theorems are good assumptions to use in AI work.  If
Novamente can outdo AIXI then AIXI's assumptions must be 'off' in some way
and knowing this *explicitly*, as opposed to having a vague intuition
about it, cannot help but be valuable.

Again, it sounds to me like, in this message, you're taking for *granted*
that AIXI and Novamente have the same theoretical foundations, and that
hence the only issue is design and how much computing power is needed, in
which case I can see why it would be intuitively straightforward to you
that (a) Novamente is a better approach than AIXI and (b) AIXI has nothing
to say to you about the pragmatic problem of designing Novamente, nor are
its theorems relevant in building Novamente, etc.  But that's exactly the
question I'm asking you.  *Do* you believe that Novamente and AIXI rest on
the same foundations?

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
