I think I'm beginning to see my confusion which stems from maybe more abstraction than I can process. Thanks Bill for your patience. If I may...

1. Hutter is credited with the Universal AI concept involving maximizing reward. But, how does one determine what is "rewarding" and to what extent one reward is better than another?

2. I see the idea that one uses observation to populate the environment. (ch 3 pg 22) But observation is tied closely to this idea of reward, and that is the "ethical" dilemma - who and what defines reward?

3. In chapter two the environment is a given. There is a hint that the environment is not simple... "In a practical AI system, like a self-driving car, observations and actions are complex computer data structures, and the probability distribution ρ (h) is expressed in a massive database and millions of lines of code." Yet, we expect an AI to observe and learn this environment by its own observation? (I see that chapter 5 deals with this...)

Let me throw up the white flag - I can see that chapter 5 is one that I need to study more closely to get closer to the ethical questions I have.

I am curious, though, about how one determines what constitutes "reward." Part of my interest in watching AI is because I want to know what the smartest "unit" in the world discovers, or considers, to be the ultimate reward / value.

Thanks again for patience and the cycles spent.
Stan


On 11/14/2014 02:23 PM, Bill Hibbard via AGI wrote:
On Fri, 14 Nov 2014, Stanley Nilsen via AGI wrote:
Okay, I'll see if I can grasp 2.3 and 2.4.
Perhaps you can lessen my pain by telling me which equations address a risk factor?

Sorry to hear this is causing you pain.

Risk is in equation (2.4):

v(ha) = \sum_{o \in O} \rho(o | ha) v(hao)

In English, this says that the value of an action
a after history h, denoted v(ha), is von Neumann
and Morgenstern's lottery of possible outcomes
from that action. The possible outcomes are the
hao, for different observations o \in O. Each
outcome hao has value v(hao) and probability
\rho(o | ha).

Risk comes in because some outcomes may have very
low value v(hao). Those values are multiplied by
the probability of the outcome, denoted
\rho(o | ha). The sum adds up the good outcomes
(high v(hao)) and the bad outcomes (low v(hao)),
multiplied by their probabilities, to get an
expected value v(ha) of the action a.

So the sum is balancing risk (low values v(hao))
against reward (high values v(hao)). Then
equations (2.3) and (2.5) choose the action that
maximizes expected value.
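The lottery in (2.4) and the maximization in (2.3)/(2.5) can be sketched in a few lines of code. This is only an illustration of the idea, not anything from the book: the histories, probabilities rho(o | ha), and values v(hao) below are made-up toy numbers, with histories written as concatenated strings.

```python
def action_value(rho, v, h, a):
    """Equation (2.4): v(ha) = sum over o of rho(o | ha) * v(hao)."""
    return sum(p * v[h + a + o] for o, p in rho[h + a].items())

def best_action(rho, v, h, actions):
    """Equations (2.3)/(2.5): choose the action maximizing expected value."""
    return max(actions, key=lambda a: action_value(rho, v, h, a))

# Toy example (assumed numbers): after history "h", action "a1" has a
# small chance of a very low-value outcome (the risk Bill describes),
# while "a2" is safe but modest.
rho = {
    "ha1": {"good": 0.9, "bad": 0.1},
    "ha2": {"good": 0.5, "bad": 0.5},
}
v = {
    "ha1good": 10.0, "ha1bad": -100.0,  # rare but disastrous outcome
    "ha2good": 4.0,  "ha2bad": 2.0,
}

print(action_value(rho, v, "h", "a1"))  # 0.9*10 + 0.1*(-100), about -1.0
print(action_value(rho, v, "h", "a2"))  # 0.5*4 + 0.5*2 = 3.0
print(best_action(rho, v, "h", ["a1", "a2"]))  # picks "a2"
```

The sum in action_value is exactly where risk enters: the -100 outcome drags a1's expected value below a2's, so the maximization avoids the risky action even though a1's best case is higher.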

Cheers,
Bill


-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/9320387-ea529a81
Modify Your Subscription: https://www.listbox.com/member/?&;
Powered by Listbox: http://www.listbox.com




