On 01/03/2014 07:40, Tim Tyler wrote:

Nor does it make much theoretical difference whether the "goodness scalar"
involved in the "universal currency" comes directly from the environment, or is
synthesized internally from sensory data and current state using a "utility
function".
Alas, an awful lot of hot air surrounds this last point.
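
To make the point concrete, here's a minimal sketch (all names and numbers
hypothetical) showing that a learning update only ever sees a scalar, so it
cannot tell whether that scalar was supplied by the environment directly or
synthesized internally by a "utility function":

```python
def external_reward(observation):
    # Case 1: the environment supplies the goodness scalar directly.
    return observation["reward"]

def internal_utility(observation, state):
    # Case 2: the scalar is synthesized internally from sensory data
    # and current state using a (made-up) utility function.
    return observation["sugar_level"] - 0.1 * state["energy_spent"]

def learn(goodness, state):
    # The update rule consumes a bare scalar; it is indifferent to
    # where that scalar came from -- which is the point above.
    state["value_estimate"] += 0.5 * (goodness - state["value_estimate"])
    return state

obs = {"reward": 1.0, "sugar_level": 0.8}
state = {"energy_spent": 2.0, "value_estimate": 0.0}

learn(external_reward(obs), state)          # environment-supplied scalar
learn(internal_utility(obs, state), state)  # internally synthesized scalar
```

Either way, `learn` runs the same update; the two "sources" differ only in
where the scalar is computed, not in the theory of the learner.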

To illustrate, here's Ben from 2009:

http://multiverseaccordingtoben.blogspot.com/2009/05/reinforcement-learning-some-limitations.html

Part of the problem is terminology. Still, it is very useful to have a general
theory of learning based on reward, utility, or whatever else you want to call
the "goodness" metric. I feel frustrated with the critics; they don't seem to
get it.

--
__________
 |im |yler  http://timtyler.org/  [email protected]  Remove lock to reply.




-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/21088071-f452e424