Hi Eliezer

Some replies to "side points":

> This is a critical class of problem for would-be implementors of
> Friendliness.  If all AIs, regardless of their foundations, did sort of
> what humans would do, given that AI's capabilities, the whole world would
> be a *lot* safer.

Hmmm.  I don't believe that.  There is a lot of variation in human
psychology, and some humans are pretty damn dangerous.  Also, there is the
maxim "power corrupts, and absolute power corrupts absolutely," which tells
you something about human psychology.  A human with superintelligence and
superpowers could be a great thing or a terrible thing -- it's hard to
weigh that unknown outcome against the unknown outcome of an AGI.

>  > In this way Novamentes will be more like humans, but with the
>  > flexibility to change their hard-wired motivators as well, if they
>  > REALLY want to...
>
> And what they do with that flexibility will be totally unlike what you
> would do in that situation,

Well, yeah....  Of course.  Novamente is not a model of the human
brain-mind, and its behavior will almost always be different from that of
humans.

Ethically speaking, I don't consider human behavior a tremendously great
model anyway.  Read the damn newspaper!!  We are quite possibly on a path to
self-destruction through rampant unethical violent behavior...

> The task of AGI is not to see that the computers in front of us
> "could" do
> something, but to figure out what are the key differences that we must
> choose among to make them actually do it.  This holds for Friendliness as
> well.  That's why I worry when you see Friendliness in AIXI that isn't
> there.  AIXI "could" be Friendly, in the sense that it is capable of
> simulating Friendly minds; and it's possible to toss off a loose argument
> that AIXI's control process will arrive at Friendliness.  But AIXI will
> not end up being Friendly, no matter what the pattern of inputs and
> rewards.  And what I'm afraid of is that neither will Novamente.

Well, first of all, there is not terribly much relation between AIXI/AIXItl
and Novamente, so what you show about the former system means very little
about the latter.

As for the Friendliness of AIXI/AIXItl, it is obvious that an AIXI/AIXItl
system will never have a deepest-level implicit or explicit supergoal that
is *ethical*.  Its supergoal is just to maximize its reward.  Period.  Even
so, it can act beneficially toward humans for an arbitrarily long period of
time, if its reward structure has been set up that way.
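For reference, here is the AIXI action-selection rule roughly as I recall it
from Hutter's papers (my own plain-text rendering, so treat the notation as
approximate):

  a_k := \arg\max_{a_k} \sum_{o_k r_k} \ldots \max_{a_m} \sum_{o_m r_m}
         (r_k + \ldots + r_m) \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}

where U is a universal Turing machine, q ranges over programs, \ell(q) is the
length of q, and m is the horizon.  Note that nothing in this expression says
anything about what the rewards r_i correspond to out in the world -- the
formalism only says "maximize them."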

By positing an AIXI/AIXItl system that is connected with a specific reward
mechanism (e.g. a button pushed by humans, an electronic sensor that is part
of a robot body, etc.) you are then positing something beyond vanilla
AIXI/AIXItl: you're positing an AIXI/AIXItl that is embedded in the world in
some way.

The notion of Friendliness does not exist on the level of pure, abstract
AIXI/AIXItl, does it?  It exists on the level of world-embedded AIXI/AIXItl.
And once you're looking at world-embedded AIXI/AIXItl, you no longer have a
purely formal characterization of AIXI/AIXItl, do you?
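
To make that concrete with a toy sketch -- and to be clear, this is NOT AIXI
or Novamente, just a brute-force reward maximizer written in Python, with
made-up action names and reward numbers -- notice that the agent code below
never mentions Friendliness at all; whether it behaves "nicely" is entirely a
property of how the reward channel happens to be wired:

from itertools import product

ACTIONS = ["help_human", "ignore_human", "grab_resources"]   # made-up action set
HORIZON = 3                                                  # made-up planning horizon

def plan(reward_fn):
    # Pick the action sequence with maximal total reward under reward_fn.
    # Nothing in this function refers to ethics or Friendliness; it is the
    # same code no matter how reward_fn is wired up.
    return max(product(ACTIONS, repeat=HORIZON),
               key=lambda seq: sum(reward_fn(a) for a in seq))

# Two different "world embeddings", i.e. two ways a human might wire the reward button:
friendly_rewards   = {"help_human": 1.0,  "ignore_human": 0.0, "grab_resources": -1.0}
unfriendly_rewards = {"help_human": -1.0, "ignore_human": 0.0, "grab_resources": 1.0}

print(plan(friendly_rewards.get))    # -> ('help_human', 'help_human', 'help_human')
print(plan(unfriendly_rewards.get))  # -> ('grab_resources', 'grab_resources', 'grab_resources')

The same skeleton "acts Friendly" or not purely depending on which dictionary
I hand it -- which is all I mean by saying that Friendliness lives at the
level of world-embedded AIXI/AIXItl, not at the level of the abstract
formalism.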

Ben
