There are different kinds of reinforcement learning.

1. The AIXI model. The agent does not know the utility function and must learn it. It assumes the simplest model that fits its observations.
2. The MIRI model. A powerful agent lives in a complex environment with a simple, well-understood (but poorly designed) utility function. It uses reasoning and thought experiments to predict which actions will maximize future reward.

3. The animal model (including humans). A reward (or penalty) acts to increase (or decrease) the frequency of the behavior performed at time t before the signal, with an effect proportional to 1/t.

4. The practical AI model. The AI has no goals. Instead, its behavior is continually updated by the humans controlling it to meet the complex and poorly understood goals of those humans.

--
Matt Mahoney, [email protected]

AGI Archives: https://www.listbox.com/member/archive/303/=now
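P.S. The credit-assignment rule in the animal model (item 3) can be sketched as a toy agent. Everything here (the class name, the learning rate, the weight floor, the sampling scheme) is an illustrative assumption on my part, not something specified above; the only claim it implements is that a reward signal reinforces the action taken t steps earlier with weight proportional to 1/t.

```python
import random

class AnimalModelAgent:
    """Toy sketch of item 3: a reward at the current time credits the
    action taken t steps earlier with weight proportional to 1/t.
    Names and constants are illustrative assumptions."""

    def __init__(self, actions):
        self.weights = {a: 1.0 for a in actions}  # action propensities
        self.history = []  # actions taken, most recent last

    def act(self):
        # Sample an action with probability proportional to its weight.
        total = sum(self.weights.values())
        r = random.uniform(0, total)
        for a, w in self.weights.items():
            r -= w
            if r <= 0:
                break
        self.history.append(a)
        return a

    def reward(self, signal, rate=0.1):
        # The action taken t steps before the signal gets credit
        # proportional to 1/t (negative signal decreases frequency).
        for t, a in enumerate(reversed(self.history), start=1):
            self.weights[a] = max(1e-6, self.weights[a] + rate * signal / t)
```

So after a history [a, b, a] and a unit reward, the most recent action a is credited 1/1 plus 1/3 of the learning rate and b is credited 1/2 of it, which is the hyperbolic decay the model describes.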
