On Friday 01 December 2006 23:42, Richard Loosemore wrote:

> It's a lot easier than you suppose.  The system would be built in two
> parts:  the motivational system, which would not change substantially
> during RSI, and the "thinking part" (for want of a better term), which
> is where you do all the improvement.

For concreteness, I have called these the Utility Function and World Model in 
my writings on the subject...
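
(A minimal sketch of what I mean by the split, in Python, with class and
method names invented purely for illustration rather than taken from any real
system: the agent picks actions by maximizing expected utility under the WM,
and self-improvement is only ever allowed to swap in a better WM.)

  from dataclasses import dataclass
  from typing import Callable, List, Tuple

  # Utility Function: maps a predicted outcome to a real-valued utility.
  # In this scheme it is the part that RSI never touches.
  UtilityFunction = Callable[[str], float]

  @dataclass
  class WorldModel:
      # predict(state, action) -> list of (outcome, probability) pairs
      predict: Callable[[str, str], List[Tuple[str, float]]]

  class Agent:
      def __init__(self, uf: UtilityFunction, wm: WorldModel):
          self.uf = uf   # fixed motivational system
          self.wm = wm   # the part RSI is allowed to improve

      def expected_utility(self, state: str, action: str) -> float:
          return sum(p * self.uf(outcome)
                     for outcome, p in self.wm.predict(state, action))

      def choose(self, state: str, actions: List[str]) -> str:
          return max(actions, key=lambda a: self.expected_utility(state, a))

      def self_improve(self, better_wm: WorldModel) -> None:
          # "Grow the WM and not the UF": only the model is replaced.
          self.wm = better_wm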

A plan that says "Let RSI consist of growing the WM and not the UF" suffers 
from the problem that the sophistication of the WM's understanding soon makes 
the UF look crude and stupid. Human babies want food and proximity to their 
mothers, and are frightened of strangers. That is fine for a baby, but a 
person with greater understanding and capabilities is better off with a more 
sophisticated UF as well (and the rest of us are better off if the person has 
one).

> It is not quite a contradiction, but certainly this would be impossible:
>   deciding to make a modification that clearly was going to leave it
> wanting something that, if it wanted that thing today, would contradict
> its current priorities.  Do you see why?  The motivational mechanism IS
> what the system wants, it is not what the system is considering wanting.

This is a good first cut at the problem; the same position is taken by, e.g., 
Nick Bostrom in a widely cited paper at http://www.nickbostrom.com/ethics/ai.html

> The system is not protecting current beliefs, it is believing its
> current beliefs.  Becoming more capable of understanding the "reality"
> it is immersed in?  You have implicitly put a motivational priority in
> your system when you suggest that that is important to it ... does that
> rank higher than its empathy with the human race?
>
> You see where I am going:  there is nothing god-given about the desire
> to "understand reality" in a better way.  That is just one more
> candidate for a motivational priority.

Ah, but consider: knowing more about how the world works is often a valuable 
asset in any attempt to increase the utility of the world, *no matter* what 
else the utility function might specify.
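
(Here is a toy numerical check of that claim, with entirely made-up states, 
actions, and probabilities; only the inequality matters. Whatever utility 
table you draw, an agent that learns which state obtains before acting does 
at least as well in expectation as one that must act blind.)

  import random

  # Toy setup (all numbers invented): two equally likely states, two actions.
  states = ["s1", "s2"]
  actions = ["a1", "a2"]
  p = {"s1": 0.5, "s2": 0.5}

  def blind_vs_informed(utility):
      # Ignorant agent: commits to one action without knowing the state.
      eu_blind = max(sum(p[s] * utility[(s, a)] for s in states) for a in actions)
      # Knowledgeable agent: learns the state first, then acts optimally in it.
      eu_informed = sum(p[s] * max(utility[(s, a)] for a in actions) for s in states)
      return eu_blind, eu_informed

  random.seed(0)
  for _ in range(5):
      u = {(s, a): random.uniform(-10, 10) for s in states for a in actions}
      blind, informed = blind_vs_informed(u)
      assert informed >= blind - 1e-9   # holds for every utility table
      print("blind utility %+.2f   informed utility %+.2f" % (blind, informed))

This is just the familiar point that the expected value of information is 
never negative, so improving the WM pays off under any UF.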

Thus, a system's self-modification (or evolution in general) is unlikely to 
remove curiosity / thirst for knowledge / desire to improve one's WM from its 
set of high-utility goals, even as it changes other things.

There are several such properties of a utility function that are likely to be 
invariant under self-improvement or evolution. It is by the use of such 
invariants that we can design self-improving AIs with reasonable assurance of 
their continued beneficence.
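
(To sketch how such an invariant might actually be enforced, again with 
invented names and an invented encoding of the UF as weighted terms: a 
proposed successor is accepted only if the designated invariant terms survive 
unchanged and the successor looks at least as good when judged by the 
*current* UF.)

  from typing import Callable, Dict

  # Hypothetical encoding: a UF as named terms with weights, some of which
  # the designers declare invariant under self-modification.
  UFTerms = Dict[str, float]
  INVARIANT_TERMS = {"curiosity", "beneficence"}

  def accept_successor(current_uf: UFTerms,
                       proposed_uf: UFTerms,
                       score: Callable[[UFTerms], float]) -> bool:
      """Gate a self-modification: the invariant terms must be carried over
      unchanged, and the proposal must look at least as good when judged by
      the current UF ('score' is a stand-in for that judgment)."""
      for term in INVARIANT_TERMS:
          if proposed_uf.get(term) != current_uf.get(term):
              return False
      return score(proposed_uf) >= score(current_uf)

  # A proposal that drops curiosity is rejected, however attractive otherwise.
  current  = {"curiosity": 1.0, "beneficence": 2.0, "paperclips": 0.1}
  proposal = {"curiosity": 0.0, "beneficence": 2.0, "paperclips": 5.0}
  print(accept_successor(current, proposal, lambda uf: sum(uf.values())))  # False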

--Josh

