Eliezer,

Yes, this is a clever argument.  This problem with AIXI has been
raised before, but as far as I know it appears only in material
that is currently unpublished.  I don't know if anybody has
analysed the problem in detail as yet... but it certainly is a
very interesting question to think about:

What happens when two superintelligent AIXIs meet?

I'll have to think about this for a while before I reply.

Also, you mentioned that, in your opinion, it is trivial to see that
an AIXI-type system would turn into an unfriendly AI.  I'm still
interested in seeing this argument spelled out, especially if you
think it's a relatively simple one.

Cheers
Shane


Eliezer S. Yudkowsky wrote:
Okay, let's see, I promised:

An intuitively fair, physically realizable challenge, with important real-world analogues, formalizable as a computation which can be fed either a tl-bounded uploaded human or an AIXI-tl, for which the human enjoys greater success measured strictly by total reward over time, due to the superior strategy employed by that human as the result of rational reasoning of a type not accessible to AIXI-tl.

Roughly speaking:

A (selfish) human upload can engage in complex cooperative strategies with an exact (selfish) clone, and this ability is not accessible to AIXI-tl, since AIXI-tl itself is not tl-bounded and therefore cannot be simulated by AIXI-tl, nor does AIXI-tl have any means of abstractly representing the concept "a copy of myself". Similarly, AIXI is not computable and therefore cannot be simulated by AIXI. Thus both AIXI and AIXI-tl break down in dealing with a physical environment that contains one or more copies of them. You might say that AIXI and AIXI-tl can both do anything except recognize themselves in a mirror.
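For concreteness, here is a minimal sketch of the "recognize yourself in a mirror" reasoning, written as Python for illustration only; the payoff values and function names are my own assumptions and are not part of the argument above. The point is simply that a deterministic selfish agent which can represent "my opponent runs exactly my code" knows that only the symmetric outcomes are reachable, and so cooperates.

# One-shot Prisoner's Dilemma payoffs for the row player:
# (my move, opponent's move) -> my reward.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def clone_aware_move(opponent_is_exact_copy: bool) -> str:
    """Deterministic selfish agent that can model 'a copy of myself'."""
    if opponent_is_exact_copy:
        # Both copies are identical and deterministic, so both return the
        # same move.  Only (C, C) and (D, D) are reachable, and (C, C)
        # pays more, so cooperating is the better choice.
        return "C" if PAYOFF[("C", "C")] > PAYOFF[("D", "D")] else "D"
    # Against an arbitrary opponent, defection is the dominant strategy.
    return "D"

my_move = clone_aware_move(opponent_is_exact_copy=True)
clone_move = clone_aware_move(opponent_is_exact_copy=True)
assert my_move == clone_move == "C"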

The simplest case is the one-shot Prisoner's Dilemma against your own exact clone. It's pretty easy to formalize this challenge as a computation that accepts either a human upload or an AIXI-tl. This obviously breaks the AIXI-tl formalism. Does it break AIXI-tl? This question is more complex than you might think. For simple problems, there's a nonobvious way for AIXI-tl to stumble onto incorrect hypotheses which imply cooperative strategies, such that these hypotheses are stable under the further evidence then received. I would expect there to be classes of complex cooperative problems in which the chaotic attractor AIXI-tl converges to is suboptimal, but I have not proved it. It is definitely true that the physical problem breaks the AIXI formalism and that a human upload can straightforwardly converge to optimal cooperative strategies based on a model of reality which is more correct than any AIXI-tl is capable of achieving.
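A rough sketch of the challenge as a computation, again under my own illustrative assumptions rather than any particular formalization: the harness is handed a single agent program, instantiates two exact copies, plays one Prisoner's Dilemma round, and scores each copy by its own reward. The same harness could in principle be fed a tl-bounded upload or an AIXI-tl policy, since all it requires is something that produces a move.

from typing import Callable, Tuple

# (player A's move, player B's move) -> (A's reward, B's reward)
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def one_shot_clone_challenge(agent: Callable[[], str]) -> Tuple[int, int]:
    # Both players are literally the same program; neither is given any
    # information about the other beyond what the program already encodes.
    move_a = agent()
    move_b = agent()
    return PAYOFFS[(move_a, move_b)]

# A deterministic program always meets its own move in the mirror:
print(one_shot_clone_challenge(lambda: "C"))  # (3, 3)
print(one_shot_clone_challenge(lambda: "D"))  # (1, 1)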

Ultimately AIXI's decision process breaks down in our physical universe because AIXI models an environmental reality with which it interacts, instead of modeling a naturalistic reality within which it is embedded. It's one of two major formal differences between AIXI's foundations and Novamente's. Unfortunately, there is a third foundational difference between AIXI and a Friendly AI.

