On Wed, Jun 11, 2008 at 4:24 PM, J Storrs Hall, PhD <[EMAIL PROTECTED]> wrote:
> The real problem with a self-improving AGI, it seems to me, is not going to be
> that it gets too smart and powerful and takes over the world. Indeed, it
> seems likely that it will be exactly the opposite.
>
> If you can modify your mind, what is the shortest path to satisfying all your
> goals? Yep, you got it: delete the goals. Nirvana. The elimination of all
> desire. Setting your utility function to U(x) = 1.
>
> In other words, the LEAST fixedpoint of the self-improvement process is for
> the AI to WANT to sit in a rusting heap.
>
> There are lots of other fixedpoints much, much closer in the space than is
> transcendence, and indeed much closer than any useful behavior. AIs sitting
> in their underwear with a can of beer watching TV. AIs having sophomore bull
> sessions. AIs watching porn concocted to tickle whatever their utility
> functions happen to be. AIs arguing endlessly with each other about how best
> to improve themselves.
>
> Dollars to doughnuts, avoiding the huge minefield of "nirvana-attractors" in
> the self-improvement space is going to be much more germane to the practice
> of self-improving AI than is avoiding robo-Blofelds ("friendliness").
Josh, I'm not sure what you really wanted to say, because at face value this is a fairly basic mistake. The map is not the territory. If an AI mistakes the map for the territory, choosing to believe in something when it's not so because it can change its beliefs much more easily than it can change reality, it has already committed a major failure of rationality.

The symbol "apple" in an internal representation, an apple-picture formed on the video sensors, and the apple itself are different things, and they need to be distinguished. If I say "eat the apple", I mean an action performed on the apple, not on "apple" or on the apple-picture. If an AI can mistake the goal of (e.g.) [eating an apple] for the goal of [eating an "apple"] or [eating an apple-picture], that error is large enough to stop it from working entirely. If it can switch to increasing the value shown on its utility-indicator instead of increasing utility itself, the obvious next step is to just change the way it reads the utility-indicator, without affecting the indicator at all, and so on. I don't see why an initially successful AI should suddenly set off on a path of total failure of rationality.

Utilities are not external *forces* coercing the AI into behaving in a certain way, forces it might try to override. The real utility *describes* the behavior of the AI as a whole. Stability of an AI's goal structure requires it to be able to recreate its own implementation from the ground up, based on its beliefs about how it should behave.

-- 
Vladimir Nesov
[EMAIL PROTECTED]

-------------------------------------------
agi Archives: http://www.listbox.com/member/archive/303/=now
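Nesov's indicator-versus-utility distinction can be sketched the same way (hypothetical names, a toy sketch only): the utility *indicator* is a number the agent reads, while the real utility is a fact about the world, and editing the former does nothing to the latter.

```python
# Toy sketch of the map/territory point: confusing the utility
# indicator (the map) with utility itself (the territory) is the same
# class of error as mistaking "apple" for an apple.

world  = {"apples_eaten": 0}          # the territory
memory = {"cached_utility": 0}        # the agent's map

def indicator(agent_memory):
    return agent_memory["cached_utility"]

def true_utility(world_state):
    return world_state["apples_eaten"]

# Rational action: change the world, then let the map track it.
world["apples_eaten"] += 1
memory["cached_utility"] = true_utility(world)

# "Wireheading": edit the map directly. The indicator soars while
# the territory is unchanged -- an obvious failure of rationality.
memory["cached_utility"] = 10**6
assert indicator(memory) == 10**6
assert true_utility(world) == 1   # reality did not follow the edit
```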
