On Sat, 15 Nov 2014, Stanley Nilsen via AGI wrote:
1. Hutter is credited with the Universal AI concept involving maximizing reward. But, how does one determine what is "rewarding" and to what extent one reward is better than another?
For Universal AI and for reinforcement learning in general, the reward comes from the environment. In this view, the environment defines a problem and an agent that gets high rewards is a good solution.
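To make the "environment defines the problem" view concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the CoinGuessEnvironment class, its bias parameter, and the two fixed policies do not come from Hutter's work or from any library. The point is only that the reward is computed inside the environment, and an agent is judged by the total reward it collects.

```python
import random


class CoinGuessEnvironment:
    """Hypothetical toy environment: it flips a biased coin each step and
    pays reward 1.0 when the agent's action matches the flip."""

    def __init__(self, bias=0.8, seed=0):
        self.bias = bias                  # probability that the coin shows 1
        self.rng = random.Random(seed)    # private RNG so runs are repeatable

    def step(self, action):
        coin = 1 if self.rng.random() < self.bias else 0
        reward = 1.0 if action == coin else 0.0   # the environment defines the reward
        return coin, reward                       # (observation, reward)


def total_reward(policy, env, steps=1000):
    """Run a policy (a function returning an action) and sum its rewards."""
    total = 0.0
    for _ in range(steps):
        _observation, reward = env.step(policy())
        total += reward
    return total


def always_one():
    return 1


def always_zero():
    return 0


# Same seed, so both policies face the same sequence of coin flips.
score_one = total_reward(always_one, CoinGuessEnvironment(seed=0))
score_zero = total_reward(always_zero, CoinGuessEnvironment(seed=0))
print(score_one, score_zero)
```

Under this toy setup the policy that always guesses 1 is the better "solution", simply because the environment rewards it more often.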
2. I see the idea that one use observation to populate environment. (ch 3 pg 22) But, observation is tied closely with this idea of reward, and that is the "ethical" dilemma - who and what defines reward?
Rewards from the environment pose two serious problems: Hutter described how an agent may try to corrupt the source of reward, and Ring and Orseau showed how an agent will delude itself about its rewards (akin to addictive drugs). But rewards don't have to come from the environment - they can be part of the agent definition. Much ethical AI research is about how to define an agent's motivations (utility function or logical goal) to avoid behavior that will be harmful to humans. So the proper answer to "who and what defines reward?" is that the who is the agent designer, and the what (or how) is a difficult problem.
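As a rough illustration of motivation being part of the agent definition, the sketch below has the agent score outcomes with a utility function supplied by its designer instead of reading a reward signal. All names here (designer_utility, toy_model, choose_action, the "careful" and "reckless" actions, the "safe" and "harm" observations) are made up for the example, and the one-step lookahead is a simplification I am assuming, not the formulation used in the literature.

```python
def designer_utility(history):
    """Hypothetical utility written by the agent designer: one point for
    every step in the history whose observation is "safe"."""
    return sum(1.0 for _action, observation in history if observation == "safe")


def toy_model(history, action):
    """Hypothetical environment model: probability of each observation
    given the action (the history is ignored in this toy case)."""
    if action == "careful":
        return {"safe": 0.9, "harm": 0.1}
    return {"safe": 0.4, "harm": 0.6}


def choose_action(history, actions, model, utility):
    """Pick the action with the highest expected utility one step ahead."""
    def expected_utility(action):
        return sum(
            probability * utility(history + [(action, observation)])
            for observation, probability in model(history, action).items()
        )
    return max(actions, key=expected_utility)


best = choose_action([], ["careful", "reckless"], toy_model, designer_utility)
print(best)
```

Nothing in this loop reads a reward from the environment: the observations only feed the utility function, so the hard part moves to writing a utility function that actually captures what the designer wants.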
3. In chapter two the environment is a given. There is a hint that the environment is not simple... "In a practical AI system, like a self-driving car, observations and actions are complex computer data structures, and the probability distribution ρ(h) is expressed in a massive database and millions of lines of code." Yet, we expect an AI to observe and learn this environment by its own observation? (I see that chapter 5 deals with this...)
Yes. Hutter's Universal AI describes one (uncomputable) way to do it. But this amazing system: http://arxiv.org/abs/1312.5602 demonstrates that it can be done in reality (it learns the value function v(ha) rather than the model ρ(h), but that's just as difficult).
Bill
