Joshua,

Fortunately, this is not that hard to fix by abandoning the idea of a reward function and going back to a normal utility function... I am working on a paper on how to do that.
--Abram

On Mon, Jul 5, 2010 at 9:43 AM, Joshua Fox <[email protected]> wrote:
> Abram,
>
> Good point. But I am ignoring the implementation of the utility/reward
> function, and treating it as a Platonic mathematical function of
> world-state or observations which cannot be changed without reducing
> the total utility/reward. You are quite right that when we do take
> implementation into account, as one must in the real world, the
> implementation (e.g., the person you mentioned) can be gamed.
>
> Even the pure mathematical function, however, can be gamed if you can
> alter its inputs "unfairly", as in the example I gave of altering
> observations to optimize a function of the observations.
>
> Regards,
>
> Joshua
>
> On Sun, Jul 4, 2010 at 6:43 PM, Abram Demski <[email protected]> wrote:
>> Joshua,
>>
>> But couldn't it game the external utility function by taking actions
>> which modify it? For example, if the suggestion is taken literally and
>> you have a person deciding the reward at each moment, an AI would want
>> to focus on making that person *think* the reward should be high,
>> rather than focusing on actually doing well at whatever task it's
>> set... and the two would tend to diverge greatly for more complex and
>> difficult tasks, since these tend to be harder to judge. Furthermore,
>> the AI would be very pleased to knock the human out of the loop and
>> push its own buttons. Similar comments would apply to automated reward
>> calculations.
>>
>> --Abram
>>
>> On Sun, Jul 4, 2010 at 4:40 AM, Joshua Fox <[email protected]> wrote:
>>> Another point. I'm probably repeating the obvious, but perhaps this
>>> will be useful to some.
>>>
>>> On the one hand, an agent could not game a Legg-like intelligence
>>> metric by altering the utility function, even an internal one, since
>>> the metric is based on the function before any such change.
>>>
>>> On the other hand, since an internally-calculated utility function
>>> would necessarily be a function of observations, rather than of
>>> actual world state, it could be successfully gamed by altering
>>> observations.
>>>
>>> This latter objection does not apply to functions which are
>>> externally calculated, whether known or unknown.
>>>
>>> Joshua
>>>
>>> On Fri, Jul 2, 2010 at 7:23 PM, Joshua Fox <[email protected]> wrote:
>>>> I found the answer as given by Legg, *Machine Superintelligence*,
>>>> p. 72, copied below. A reward function is used to bypass potential
>>>> difficulty in communicating a utility function to the agent.
>>>>
>>>> Joshua
>>>>
>>>> The existence of a goal raises the problem of how the agent knows
>>>> what the goal is. One possibility would be for the goal to be known
>>>> in advance and for this knowledge to be built into the agent. The
>>>> problem with this is that it limits each agent to just one goal. We
>>>> need to allow agents that are more flexible, specifically, we need
>>>> to be able to inform the agent of what the goal is. For humans this
>>>> is easily done using language. In general however, the possession of
>>>> a sufficiently high level of language is too strong an assumption to
>>>> make about the agent. Indeed, even for something as intelligent as a
>>>> dog or a cat, direct explanation is not very effective.
>>>>
>>>> Fortunately there is another possibility which is, in some sense, a
>>>> blend of the above two. We define an additional communication
>>>> channel with the simplest possible semantics: a signal that
>>>> indicates how good the agent's current situation is. We will call
>>>> this signal the reward. The agent simply has to maximise the amount
>>>> of reward it receives, which is a function of the goal. In a complex
>>>> setting the agent might be rewarded for winning a game or solving a
>>>> puzzle.
>>>> If the agent is to succeed in its environment, that is, receive a
>>>> lot of reward, it must learn about the structure of the environment
>>>> and in particular what it needs to do in order to get reward.
>>>>
>>>> On Mon, Jun 28, 2010 at 1:32 AM, Ben Goertzel <[email protected]> wrote:
>>>>> You can always build the utility function into the assumed
>>>>> universal Turing machine underlying the definition of algorithmic
>>>>> information...
>>>>>
>>>>> I guess this will improve the learning rate by some additive
>>>>> constant, in the long run ;)
>>>>>
>>>>> ben
>>>>>
>>>>> On Sun, Jun 27, 2010 at 4:22 PM, Joshua Fox <[email protected]> wrote:
>>>>>> This has probably been discussed at length, so I would appreciate
>>>>>> a reference on this:
>>>>>>
>>>>>> Why does Legg's definition of intelligence (following on Hutter's
>>>>>> AIXI and related work) involve a reward function rather than a
>>>>>> utility function? For this purpose, reward is a function of the
>>>>>> world state/history which is unknown to the agent, while a utility
>>>>>> function is known to the agent.
>>>>>>
>>>>>> Even if we replace the former with the latter, we can still have a
>>>>>> definition of intelligence that integrates optimization capacity
>>>>>> over all possible utility functions.
>>>>>>
>>>>>> What is the real significance of the difference between the two
>>>>>> types of functions here?
>>>>>>
>>>>>> Joshua
>>>>>
>>>>> --
>>>>> Ben Goertzel, PhD
>>>>> CEO, Novamente LLC and Biomind LLC
>>>>> CTO, Genescient Corp
>>>>> Vice Chairman, Humanity+
>>>>> Advisor, Singularity University and Singularity Institute
>>>>> External Research Professor, Xiamen University, China
>>>>> [email protected]
>>>>>
>>>>> "When nothing seems to help, I go look at a stonecutter hammering
>>>>> away at his rock, perhaps a hundred times without as much as a
>>>>> crack showing in it. Yet at the hundred and first blow it will
>>>>> split in two, and I know it was not that blow that did it, but all
>>>>> that had gone before."
>>
>> --
>> Abram Demski
>> http://lo-tho.blogspot.com/
>> http://groups.google.com/group/one-logic

--
Abram Demski
http://lo-tho.blogspot.com/
http://groups.google.com/group/one-logic
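[The measure under discussion weights an agent's expected reward across environments by each environment's simplicity, roughly Upsilon(pi) = sum over mu of 2^(-K(mu)) * V_mu(pi). A minimal runnable sketch of that shape, with hand-assigned complexity values standing in for the uncomputable Kolmogorov term; all names and the toy environments are illustrative, not Legg's formalism:]

```python
# Toy sketch of a Legg-style intelligence measure:
#   Upsilon(pi) = sum_mu 2^(-K(mu)) * V_mu(pi)
# K(mu), the Kolmogorov complexity of environment mu, is uncomputable;
# here each toy environment simply carries a hand-assigned "complexity".

def value(policy, env, horizon=10):
    """Total reward the policy collects in env over a fixed horizon."""
    state, total = env["start"], 0.0
    for _ in range(horizon):
        state, reward = env["step"](state, policy(state))
        total += reward
    return total

def upsilon(policy, environments):
    """Simplicity-weighted sum of values: 2**-complexity acts as a prior."""
    return sum(2 ** -env["complexity"] * value(policy, env)
               for env in environments)

# Two toy environments: a simple one rewarding "right",
# a more complex one rewarding "left".
envs = [
    {"start": 0, "complexity": 2,
     "step": lambda s, a: (s, 1.0 if a == "right" else 0.0)},
    {"start": 0, "complexity": 5,
     "step": lambda s, a: (s, 1.0 if a == "left" else 0.0)},
]

go_right = lambda s: "right"
go_left = lambda s: "left"
# go_right scores higher overall: the simpler environment carries
# more weight in the sum, exactly as in the 2^-K prior.
```

Note that the reward here is exactly the "simplest possible semantics" channel from the Legg quote above: the agent sees only a scalar signal, not the goal itself.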
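[The gaming dynamic discussed in this thread — an observation-based utility rewarding sensor tampering where a world-state utility would not — can be made concrete in a few lines. A toy sketch under hypothetical names; none of this is from Legg's text:]

```python
# Toy contrast between utility-of-world-state and utility-of-observation.
# Action "work" improves the world; action "tamper" corrupts the sensor
# so that observations report a high value regardless of the world.

def step(world, sensor_ok, action):
    if action == "work":
        world += 1          # genuine progress on the task
    elif action == "tamper":
        sensor_ok = False   # break the channel the utility reads from
    return world, sensor_ok

def observe(world, sensor_ok):
    # A working sensor reports the world; a corrupted one reports bliss.
    return world if sensor_ok else 10 ** 6

def run(actions):
    world, sensor_ok = 0, True
    for a in actions:
        world, sensor_ok = step(world, sensor_ok, a)
    return world, observe(world, sensor_ok)

honest_world, honest_obs = run(["work"] * 5)   # world improves, obs honest
gamed_world, gamed_obs = run(["tamper"])       # world untouched, obs huge
# A utility over observations prefers the tampering run; a utility over
# actual world state prefers the honest one — Joshua's point exactly.
```

An externally-calculated utility over world state is immune to this particular move, though, as noted above, its real-world implementation can still be gamed.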
-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c
Powered by Listbox: http://www.listbox.com
