Bill Hibbard wrote:
> Hey Eliezer, my name is Hibbard, not Hubbard.
*Argh* <sound of hand whapping forehead> sorry.

> On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:

>> *takes deep breath*

> This is probably the third time you've sent a message
> to me over the past few months where you make some
> remark like this to indicate that you are talking down
> to me.

No, that's the sound of a lone overworked person taking on yet another simultaneous conversation. But I digress.

>> But then you seem to chicken out on the exchange.

> For example, this morning I pointed out the fallacy in
> your AIXI argument and got no reply (you were assuming
> that humans have some way of knowing, rather than just
> estimating, the intention of other minds).

Okay.  I shall reply to that as well.

>> You're thinking of the logical entailment approach, and the problem with
>> that, as it appears to you, is that no simple set of built-in principles
>> can entail everything the SI needs to know about ethics - right?

> Yes. Laws (logical constraints) are inevitably ambiguous.

Does that include the logical constraints governing the reinforcement process itself?

>> Like, the complexity of everything the SI needs to do is some very high
>> quantity, while the complexity of the principles that are supposed to
>> entail it is small, right?

> As wonderfully demonstrated by Eric Baum's papers, complex
> behaviors are learned via simple values.

*Some* complex behaviors can be learned via *some* simple values. The question is understanding *which* simple values result in the learning of which complex behaviors; for example, Eric Baum's system had to be created with simple values that behave in a very precise way in order to achieve its current level of learning ability. That's why Eric Baum had to write the paper, instead of just saying "Aha, I can produce complex behaviors via simple values."
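
To make that concrete, here is a minimal sketch (mine, not Baum's actual system, whose machinery is considerably more involved): the same tabular Q-learner run twice with two different "simple values". The learning code is identical; only the value signal differs, and so does the complex behavior it converges to.

    import random

    N = 8  # corridor of states 0..7; actions are -1 (left) and +1 (right)

    def train(reward, episodes=2000, alpha=0.5, gamma=0.9, eps=0.1):
        # Plain tabular Q-learning; 'reward' is the "simple value" under test.
        q = {(s, a): 0.0 for s in range(N) for a in (-1, 1)}
        for _ in range(episodes):
            s = random.randrange(N)
            for _ in range(20):
                a = random.choice((-1, 1)) if random.random() < eps \
                    else max((-1, 1), key=lambda x: q[(s, x)])
                s2 = min(max(s + a, 0), N - 1)
                q[(s, a)] += alpha * (reward(s2)
                                      + gamma * max(q[(s2, -1)], q[(s2, 1)])
                                      - q[(s, a)])
                s = s2
        return [max((-1, 1), key=lambda x: q[(s, x)]) for s in range(N)]

    # Same learner, two different simple values, two different behaviors:
    print(train(lambda s: 1.0 if s == N - 1 else 0.0))  # learns to go right
    print(train(lambda s: 1.0 if s == 0 else 0.0))      # learns to go left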

>> If SIs have behaviors that are reinforced by a set of values V, what is
>> the internal mechanism that an SI uses to determine the amount of V?
>> Let's say that the SI contains an internal model of the environment, which
>> I think is what you mean by temporal credit assignment, et cetera, and the
>> SI has some predicate P that applies to this internal model and predicts
>> the amount of "human happiness" that exists.  Or perhaps you weren't
>> thinking of a system that complex; perhaps you just want a predicate P
>> that applies to immediate sense data, like the human sense of pleasurable
>> tastes.
>>
>> What is the complexity of the predicate P?
>>
>> I mean, I'm sure it seems very straightforward to you to determine when
>> "human happiness" is occurring...
> There are already people developing special AI programs to
> recognize emotions in human facial expressions and voices.
> And emotional expressions in body language shouldn't be much
> harder. I'm not claiming that these problems are totally
> solved, just that they are much easier than AGI, and can
> serve as reinforcement values for an AGI. The value V or
> predicate P used for reinforcement is immediate, and
> relatively simple. Reinforcement learning generates very
> complex behaviors from these. Credit assignment, including
> temporal credit assignment, is the problem of understanding
> cause and effect relations between multiple behaviors and
> future values.

Yes, reinforcement learning generates very complex behaviors from there. The question is *which* complex behaviors - whether you see all the complex behaviors you want to see, and none of the complex behaviors you don't want to see.
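
To illustrate what I mean by "which complex behaviors", here is a toy I just made up (it is not your proposal or Baum's system, just a cartoon of the failure mode): the predicate P fires on the *percept* "smile detected", not on the underlying fact "a human is happy", and two quite different actions both satisfy it.

    import random

    def P(percept):
        # Immediate, simple predicate on sense data -> reward.
        return 1.0 if percept == "smile" else 0.0

    def environment(action):
        # What the sensors report after each action (hypothetical world).
        if action == "help_human":
            return "smile"    # the behavior you wanted reinforced
        if action == "show_smiley_picture":
            return "smile"    # a different behavior, same percept
        return "neutral"

    actions = ["help_human", "show_smiley_picture", "do_nothing"]
    value = {a: 0.0 for a in actions}   # running average reward per action
    counts = {a: 0 for a in actions}
    for _ in range(3000):               # epsilon-greedy bandit learner
        a = random.choice(actions) if random.random() < 0.1 \
            else max(actions, key=value.get)
        r = P(environment(a))
        counts[a] += 1
        value[a] += (r - value[a]) / counts[a]
    print(value)  # both smile-producing actions converge toward 1.0

Nothing in the value signal distinguishes the behavior you wanted from the behavior you didn't; P is satisfied either way.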

Can I take it from the above that you believe that AI morality can be created by reinforcing behaviors using a predicate P that acts on incoming video sensory information to recognize smiles and laughter and generate a reward signal? Is this adequate for a superintelligence too?
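
For reference, the architecture I'm asking about, as I understand it - every name below is a hypothetical placeholder I invented for the question, not something specified in this thread:

    class StubSmileRecognizer:
        # Stand-in for the expression-recognition programs you mention.
        def count_smiles(self, frame):
            # Toy: frames are strings here; a real recognizer takes pixels.
            return frame.count("smile")

    def P(frame, recognizer):
        # Immediate predicate on incoming video, emitted as a reward signal.
        return float(recognizer.count_smiles(frame))

    recognizer = StubSmileRecognizer()
    for frame in ["neutral face", "one smile", "smile smile"]:
        print(frame, "->", P(frame, recognizer))  # 0.0, 1.0, 2.0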

--
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
