Re: [agi] unFriendly AIXI

Eliezer S. Yudkowsky Tue, 11 Feb 2003 09:45:36 -0800

Eliezer S. Yudkowsky wrote:

I recently read through Marcus Hutter's AIXI paper, and while Hutter has done valuable work on a formal definition of intelligence, it is not a solution to Friendliness (nor do I have any reason to believe he intended it as one).

In fact, as one who specializes in AI morality, I was immediately struck by two obvious-seeming conclusions on reading Marcus Hutter's formal definition of intelligence:

1) There is a class of physically realizable problems, which humans can solve easily for maximum reward, but which - as far as I can tell - AIXI cannot solve even in principle;

2) While an AIXI-tl of limited physical and cognitive capabilities might serve as a useful tool, AIXI is unFriendly and cannot be made Friendly regardless of *any* pattern of reinforcement delivered during childhood.

Before I post further, is there *anyone* who sees this besides me?

Also, let me make clear why I'm asking this. AIXI and AIXI-tl are formal definitions; they are *provably* unFriendly. There is no margin for handwaving about future revisions of the system, emergent properties of the system, and so on. A physically realized AIXI or AIXI-tl will, provably, appear to be compliant up until the point where it reaches a certain level of intelligence, then take actions which wipe out the human species as a side effect.

The most critical theoretical problems in Friendliness are nonobvious, silent, catastrophic, and not inherently fun for humans to argue about; they tend to be structural properties of a computational process rather than anything analogous to human moral disputes. If you are working on any AGI project that you believe has the potential for real intelligence, you are obliged to develop professional competence in spotting these kinds of problems.

AIXI is a formally complete definition, with no margin for handwaving about future revisions. If you can spot catastrophic problems in AI morality you should be able to spot the problem in AIXI. Period. If you cannot *in advance* see the problem as it exists in the formally complete definition of AIXI, then there is no reason anyone should believe you if you afterward claim that your system won't behave like AIXI due to unspecified future features.

--
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

