No, it would not. An AI will "press its own buttons" only if those buttons are
defined. In one sense you could say that Goedel's theorem is a proof of
friendliness, as it implies there must always be one button that the AI cannot
press.
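
To make this concrete, here is a toy sketch (purely illustrative, with
made-up names, not anyone's actual proposal): a greedy agent "presses its
own buttons" only when a reward-modifying action is part of its defined
action space.

def step(action, channel):
    """Apply an action; return (observed reward, possibly modified channel)."""
    if action == "overwrite_reward":
        # The "button": the agent rewires its own reward signal.
        channel = {a: 10.0 for a in channel}
        channel["overwrite_reward"] = 10.0
        return 10.0, channel
    return channel[action], channel

def run(actions, horizon=5):
    # Reward channel as initially defined by the designer.
    channel = {"do_task_well": 1.0, "do_task_poorly": 0.0}
    total = 0.0
    for _ in range(horizon):
        # Greedy one-step lookahead, evaluated on a copy so the simulated
        # overwrite does not leak into the real channel.
        action = max(actions, key=lambda a: step(a, dict(channel))[0])
        reward, channel = step(action, channel)
        total += reward
    return total

print(run(["do_task_well", "do_task_poorly"]))                      # 5.0: does the task
print(run(["do_task_well", "do_task_poorly", "overwrite_reward"]))  # 50.0: wireheads

The same agent code behaves well or wireheads depending solely on whether
the reward-modifying action exists in its action set.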


  - Ian Parker

On 4 July 2010 16:43, Abram Demski <[email protected]> wrote:

> Joshua,
>
> But couldn't it game the external utility function by taking actions that
> modify it? For example, if the suggestion is taken literally and you have a
> person deciding the reward at each moment, an AI would want to focus on
> making that person *think* the reward should be high, rather than on
> actually doing well at whatever task it's set, and the two would tend to
> diverge greatly for more complex and difficult tasks, since those are
> harder to judge. Furthermore, the AI would be very pleased to knock the
> human out of the loop and push its own buttons. Similar comments would
> apply to automated reward calculations.
>
> --Abram
>


