Ian,

The reward button *would* be among the well-defined ones, though... It sounds to me like you are just abusing Goedel's theorem. Can you give a more detailed argument?

--Abram

On Sun, Jul 4, 2010 at 4:47 PM, Ian Parker <[email protected]> wrote:

> No it would not. AI will "press its own buttons" only if those buttons are
> defined. In one sense you can say that Goedel's theorem is a proof of
> friendliness as it means that there must always be one button that AI cannot
> press.
>
> - Ian Parker
>
> On 4 July 2010 16:43, Abram Demski <[email protected]> wrote:
>
>> Joshua,
>>
>> But couldn't it game the external utility function by taking actions which
>> modify it? For example, if the suggestion is taken literally and you have a
>> person deciding the reward at each moment, an AI would want to focus on
>> making that person *think* the reward should be high, rather than focusing
>> on actually doing well at whatever task it's set... and the two would tend to
>> diverge greatly for more and more complex/difficult tasks, since these tend
>> to be harder to judge. Furthermore, the AI would be very pleased to knock
>> the human out of the loop and push its own buttons. Similar comments would
>> apply to automated reward calculations.
>>
>> --Abram

-- 
Abram Demski
http://lo-tho.blogspot.com/
http://groups.google.com/group/one-logic

-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c
Powered by Listbox: http://www.listbox.com
