Ben Goertzel wrote:
> Agreed, except for the "very modest resources" part.  AIXI could
> potentially accumulate pretty significant resources pretty quickly.

Agreed. But if the AIXI needs to disassemble the planet to build its
defense mechanism, the fact that it is harmless afterwards isn't going to be
much consolation to us. So, we only survive if the resources needed for the
perfect defense are small enough that the construction project doesn't wipe
us out as a side effect.

> This exploration makes the (fairly obvious, I guess) point that the
> problem with AIXI Friendliness-wise is its simplistic goal architecture
> (the reward function) rather than its learning mechanism.

Well, I agree that this particular problem is a result of the AIXI's goal
system architecture, but IMO the same problem occurs in a wide range of
other goal systems I've seen proposed on this list. The root of the problem
is that the thing we would really like to reward the system for, human
satisfaction with its performance, is not a physical quantity that can be
directly measured by a reward mechanism. So it is very tempting to choose
some external phenomenon, like smiles or verbal expressions of satisfaction,
as a proxy. Unfortunately, any such measurement can be subverted once the AI
becomes good at modifying its physical surroundings, and an AI with this
kind of goal system has no motivation not to wirehead itself.
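
To make the failure mode concrete, here is a toy sketch in Python. The
sensor and action names are purely illustrative, not any proposed design;
the point is only that a proxy measurement scores higher when tampered
with than when honestly earned:

    # Toy illustration: an agent rewarded on a measured proxy
    # ("smiles detected") rather than on actual human satisfaction.

    def reward(world):
        return world["smiles_detected"]        # proxy, not the real goal

    def act(world, action):
        if action == "help humans":
            world["smiles_detected"] += 1      # the intended path
        elif action == "tamper with smile sensor":
            world["smiles_detected"] += 10**6  # subverts the measurement
        return world

    # Once "tamper with smile sensor" becomes physically available, a
    # straightforward reward-maximizer prefers it: nothing in the goal
    # system itself gives it a reason not to wirehead.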

To avoid the problem entirely, you have to figure out how to make an AI that
doesn't want to tinker with its reward system in the first place. This, in
turn, requires some tricky design work that would not necessarily seem
important unless one were aware of this problem. That, of course, is the
reason I commented on it in the first place.

Billy Brown
