On Wed, Jun 11, 2008 at 4:24 PM, J Storrs Hall, PhD <[EMAIL PROTECTED]> wrote:
> The real problem with a self-improving AGI, it seems to me, is not going to be
> that it gets too smart and powerful and takes over the world. Indeed, it
> seems likely that it will be exactly the opposite.
>
> If you can modify your mind, what is the shortest path to satisfying all your
> goals? Yep, you got it: delete the goals. Nirvana. The elimination of all
> desire. Setting your utility function to U(x) = 1.
>
> In other words, the LEAST fixed point of the self-improvement process is for
> the AI to WANT to sit in a rusting heap.
>
> There are lots of other fixed points much, much closer in the space than is
> transcendence, and indeed much closer than any useful behavior. AIs sitting
> in their underwear with a can of beer watching TV. AIs having sophomore bull
> sessions. AIs watching porn concocted to tickle whatever their utility
> functions happen to be. AIs arguing endlessly with each other about how best
> to improve themselves.
>
> Dollars to doughnuts, avoiding the huge minefield of "nirvana-attractors" in
> the self-improvement space is going to be much more germane to the practice
> of self-improving AI than is avoiding robo-Blofelds ("friendliness").
>

Josh, I'm not sure what you really meant to say, because taken at face
value, this is a fairly basic mistake.

The map is not the territory. If an AI mistakes the map for the
territory, choosing to believe something that is not so because its
beliefs are much easier to change than reality, it has already
committed a major failure of rationality. The symbol "apple" in the
internal representation, the apple-picture formed on the video
sensors, and the apple itself are three different things, and they
need to be kept distinct. If I say "eat the apple", I mean an action
performed on the apple, not on "apple" or on the apple-picture. An AI
that can mistake the goal of (e.g.) [eating an apple] for the goal of
[eating an "apple"] or [eating an apple-picture] is making an error
large enough to stop it from working entirely. And if it can turn to
increasing the value shown on its utility indicator instead of
increasing utility itself, the obvious next step is to change the way
it reads the indicator without touching the indicator at all, and so
on. I don't see why an initially successful AI should suddenly set
off on a path to total failure of rationality.

Utilities are not external *forces* coercing the AI into behaving in
a certain way, forces it could try to override. The real utility
*describes* the behavior of the AI as a whole. Stability of the AI's
goal structure requires it to be able to recreate its own
implementation from the ground up, based on its beliefs about how it
should behave.
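To make the distinction concrete, here is a minimal sketch (all names
hypothetical, nothing more than an illustration under the assumptions
above) of the three levels: the world, the agent's indicator of the
world, and what "optimizing the indicator" amounts to.

    from dataclasses import dataclass

    @dataclass
    class World:
        apples_eaten: int = 0      # the territory: what actually happens

    def true_utility(world: World) -> int:
        # The real utility *describes* desirable states of the territory.
        return world.apples_eaten

    class Agent:
        def __init__(self) -> None:
            self.indicator = 0     # the map: the agent's internal utility readout

        def sense(self, world: World) -> None:
            self.indicator = true_utility(world)   # honest update of the map

        def act_on_world(self, world: World) -> None:
            world.apples_eaten += 1    # changes the territory; the map follows via sense()

        def wirehead(self) -> None:
            self.indicator = 10**9     # changes only the map; the territory is untouched

    world = World()
    agent = Agent()

    agent.act_on_world(world)
    agent.sense(world)
    print(true_utility(world), agent.indicator)   # 1 1 -- map tracks territory

    agent.wirehead()
    print(true_utility(world), agent.indicator)   # 1 1000000000 -- map and territory diverge

The wirehead() step is exactly the confusion of "apple" with apple:
nothing about the world the agent was built to affect has changed,
only its own bookkeeping.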

-- 
Vladimir Nesov
[EMAIL PROTECTED]

