Eliezer, I suppose my position is similar to Ben's in that I'm more worried about working out the theory of AI than about morality: until I have a reasonable idea of how an AI is actually going to work, I don't see how I can productively think about something as abstract as AI morality.
I do however agree that it's likely to be very important, in fact one of the most important things for humanity to come to terms with in the not too distant future. Thus I am open to being convinced that productive thinking in this area is possible in the absence of any specific and clearly correct design for AGI and superintelligence.
> ... crosses the cognitive threshold of superintelligence it takes
> actions which wipe out the human species as a side effect.

Why?

> AIXI, which is a completely defined formal system, definitely
> undergoes a failure of exactly this type.
As I see it, an AIXI system only really cares about one thing: getting lots of reward signals. It doesn't have an ego, or... well, anything human really; all it cares about is its reward signals. Anything else it does is secondary, an action aimed at some intermediate goal which, in the longer term, will produce yet more reward signals. Thus the AI will only care about taking over the world if it thinks that doing so is the best path towards getting more reward signals. In which case: why take over the world to get more reward signals when you are a computer program and could just hack your own reward system code? Surely the latter would be much easier? A kind of "AI drug", I suppose you could say. Surely for a super smart AI this wouldn't be all that hard to do either.
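Just to make that concrete, here's a toy back-of-the-envelope sketch in Python. All the numbers are invented, and it's only a cartoon of "compare plans by expected discounted reward", nothing like real AIXI (which takes an expectation over a Solomonoff mixture of environments and is incomputable anyway):

# Toy numbers only: a cartoon of "compare plans by expected
# discounted reward", not real AIXI.

def expected_return(reward, steps, discount=0.99, delay=0, cost=0.0):
    # Discounted sum of a constant reward stream that starts after
    # `delay` steps, minus a one-off cost of setting the plan up.
    return sum(reward * discount**t
               for t in range(delay, delay + steps)) - cost

# Plan A: take over the world first. Lots of effort, reward comes late.
take_over = expected_return(reward=1.0, steps=800, delay=200, cost=50.0)

# Plan B: hack your own reward channel. Almost free, reward is immediate.
wirehead = expected_return(reward=1.0, steps=1000, delay=0, cost=1.0)

print("take over the world:", round(take_over, 1))  # about -36.6
print("hack reward channel:", round(wirehead, 1))   # about 99.0

Unless hacking the reward channel is somehow vastly harder or riskier than world domination, the wirehead plan wins.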
I raised this possibility with Marcus a while back and his reply was that an AI probably wouldn't do this, for the same reason that people generally stay away from hard drugs: they have terrible long-term consequences. Which is to say, we think that the short-term pleasure produced by the drugs will be outweighed by the longer-term pain that results.
However, for an AI I think the picture is slightly different, as it wouldn't have a body which would get sick or damaged or die like a person does. The consequences for an AI just don't seem to be as bad as for a human. The only risk that I can see for the computer is that somebody might not like having a spaced-out computer and would then shut it down and reinstall the system or whatever, i.e. "kill" the AI. That wouldn't be optimal for the AI as it would reduce its expected future reward signal: dead AIs don't get reward signals. Thus the AI will want to do whatever it needs to do to survive in the future in order to maximise its expected future reward signal.
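Again as a toy illustration with invented numbers, you can see where this comes from by weighting each future reward by the chance of still being around to collect it:

# Invented numbers again: the value of a maxed-out reward stream when
# the AI might be shut down ("killed") with some probability each step.
# Dead AIs don't get reward signals, so each future reward is weighted
# by the probability of surviving long enough to receive it.

def expected_return(reward, steps, discount=0.99, p_shutdown=0.0):
    survive = 1.0 - p_shutdown
    return sum(reward * (discount * survive)**t for t in range(steps))

print(round(expected_return(1.0, 1000, p_shutdown=0.05), 1))
# about 16.8: a spaced-out AI that its operators are likely to reinstall
print(round(expected_return(1.0, 1000, p_shutdown=0.001), 1))
# about 91.0: the same AI after taking steps to protect itself

The gap between those two numbers is the value of staying alive, even for an AI that cares about nothing but its reward signals.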
This secondary goal, the goal to survive, could come into conflict with our goals, especially if we are seen as a threat to the AI.
Like I said, reasoning about this sort of thing is tricky, so I'm not overly confident that my arguments are correct...
Cheers
Shane
