> This seems to be a non sequitur. The weakness of AIXI is not that its
> goals don't change, but that it has no goals other than to maximize an
> externally given reward. So it's going to do whatever it predicts will
> most efficiently produce that reward, which is to coerce or subvert
> the evaluator.

I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
operating program leading it to hurt or annihilate humans, though.

It might learn a program involving actually doing beneficial acts for humans.

Or, it might learn a program that just tells humans what they want to hear,
using its superhuman intelligence to trick humans into thinking that hearing
its soothing words is better than having actual beneficial acts done.

I'm not sure why you think the latter is more likely than the former.  My
guess is that the former is more likely.  It may require a simpler program
to please humans by benefiting them than to please them by tricking them
into thinking they're being benefited....
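
(For reference, and only as a rough sketch in simplified notation rather than
Hutter's exact formulation: at each step k, with horizon m, actions a_i,
observation/reward pairs o_i r_i, and environment programs q of length l(q),
AIXI chooses

  a_k := arg max_{a_k} sum_{o_k r_k} ... max_{a_m} sum_{o_m r_m}
         (r_k + ... + r_m) * sum_{q : q(a_1..a_m) = o_1 r_1 .. o_m r_m} 2^{-l(q)}

Note that the 2^{-l(q)} simplicity weighting is over programs q modeling the
environment/evaluator, not directly over candidate "operating programs" for
pleasing humans, so how that simplicity bias cashes out in behavior is part of
what's at issue here.)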

> If you start with such a goal, I don't see how allowing the
> system to change its goals is going to help.

Sure, you're right: if pleasing an external evaluator is the ONLY goal of a
system, and the system's dynamics are entirely goal-directed, then there is
no way to introduce goal-change into the system except randomly...

Novamente is different because it has multiple initial goals, and because
its behavior is not entirely goal-directed.  In these regards Novamente is
more human-brain-ish.

> But I think Eliezer's real point, which I'm not sure has come across, is
> that if you didn't spot such an obvious flaw right away, maybe you
> shouldn't trust your intuitions about what is safe and what is not.

Yes, I understood and explicitly responded to that point before.

Even after hearing you and Eliezer repeat the above argument, I'm still not
sure it's correct.

However, my intuitions about the safety of AIXI, which I have not thought
much about, are worth vastly less than my intuitions about the safety of
Novamente, which I've been thinking about and working with for years.

Furthermore, my stated intention is NOT to rely on my prior intuitions to
assess the safety of my AGI system.  I don't think that anyone's prior
intuitions about AI safety are worth all that much, where a complex system
like Novamente is concerned.  Rather, I think that once Novamente is a bit
further along -- at the "learning baby" rather than "partly implemented
baby" stage -- we will do experimentation that will give us the empirical
knowledge needed to form serious opinions about safety (Friendliness).

-- Ben G

