Personally, if I were to take the approach of a preprogrammed ethics,
I would define good in pseudo-evolutionary terms: a pattern/entity is
good if it has high survival value in the long term. Patterns that are
self-sustaining on their own are thus considered good, but patterns
that help sustain other patterns would be too, because they are a
high-utility part of a larger whole.

Actually, I *do* define good and ethics not only in evolutionary terms but as being driven by evolution. Unlike most people, I believe that ethics is *entirely* driven by what is best evolutionarily while not believing at all in "red in tooth and claw". I can give you a reading list that shows that the latter view is horribly outdated among people who keep up with the research rather than just rehashing tired old ideas.

Actually, that idea is what made me assert that any goal produces
normalizing subgoals. Survivability helps achieve any goal, as long as
it isn't a time-bounded goal (finishing a set task).

Ah, I'm starting to get an idea of what you mean by normalizing subgoals . . . . Yes, absolutely, except that I contend that there is exactly one normalizing subgoal (though some might phrase it as two) that is common to virtually every goal (except in very extreme/unusual circumstances).


----- Original Message ----- From: "Abram Demski" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Thursday, August 28, 2008 4:04 PM
Subject: **SPAM** Re: AGI goals (was Re: Information theoretic approaches to AGI (was Re: [agi] The Necessity of Embodiment))


Mark,

I still think your definitions sound difficult to implement,
although not nearly as hard as "make humans happy without modifying
them". How would you define "consent"? You'd need a definition of
decision-making entity, right?

Personally, if I were to take the approach of a preprogrammed ethics,
I would define good in pseudo-evolutionary terms: a pattern/entity is
good if it has high survival value in the long term. Patterns that are
self-sustaining on their own are thus considered good, but patterns
that help sustain other patterns would be too, because they are a
high-utility part of a larger whole.

Actually, that idea is what made me assert that any goal produces
normalizing subgoals. Survivability helps achieve any goal, as long as
it isn't a time-bounded goal (finishing a set task).

--Abram

On Thu, Aug 28, 2008 at 2:52 PM, Mark Waser <[EMAIL PROTECTED]> wrote:
However, it
doesn't seem right to me to preprogram an AGI with a set ethical
theory; the theory could be wrong, no matter how good it sounds.

Why not wait until a theory is derived before making this decision?

Wouldn't such a theory be a good starting point, at least?

better to put such ideas in only as probabilistic correlations (or
"virtual evidence"), and let the system change its beliefs based on
accumulated evidence. I do not think this is overly risky, because
whatever the system comes to believe, its high-level goal will tend to
create normalizing subgoals that will regularize its behavior.

You're getting into implementation here but I will make a couple of personal
belief statements:

1. Probabilistic correlations are much, *much* more problematical than most
people are even willing to think about.  They work well with very simple
examples but they do not scale well at all.  Particularly problematic for
such correlations is the fact that ethical concepts are generally made up of
*many* interwoven parts and are very fuzzy.  The church of Bayes does not
cut it for any work where the language/terms/concepts are not perfectly
crisp, clear, and logically correct.
2.  Statements like "its high-level goal will tend to create normalizing
subgoals that will regularize its behavior" sweep *a lot* of detail under
the rug.  It's possible that it is true.  I think that it is much more
probable that it is very frequently not true.  Unless you do *a lot* of
specification, I'm afraid that expecting this to be true is *very* risky.

I'll stick to my point about defining "make humans happy" being hard,
though. Especially with the restriction "without modifying them" that
you used.

I think that defining "make humans happy" is impossible -- but that's OK
because I think that it's a really bad goal to try to implement.

All I need to do is to define learn, harm, and help. Help could be defined as anything which is agreed to with informed consent by the affected subject
both before and after the fact.  Yes, that doesn't cover all actions but
that just means that the AI doesn't necessarily have a strong inclination
towards those actions. Harm could be defined as anything which is disagreed
with (or is expected to be disagreed with) by the affected subject either
before or after the fact.  Friendliness then turns into something like
asking permission.  Yes, the Friendly entity won't save you in many
circumstances, but it's not likely to kill you either.
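
Purely as an illustration of the logical shape of those definitions (a small
Python sketch; the three-valued consent inputs are hypothetical stand-ins for
whatever the AI would actually infer about the affected subject):

def classify_action(consent_before, expected_consent_after):
    # Each argument is one of "yes", "no", "unknown".
    # Help = informed consent both before and after the fact.
    # Harm = disagreement (actual or expected) before or after the fact.
    # Anything else is simply not covered: the AI has no strong
    # inclination toward it, but it is not flagged as harm either.
    if consent_before == "yes" and expected_consent_after == "yes":
        return "help"
    if consent_before == "no" or expected_consent_after == "no":
        return "harm"
    return "uncovered"

# "Friendliness turns into something like asking permission":
print(classify_action("yes", "yes"))          # help
print(classify_action("unknown", "no"))       # harm
print(classify_action("unknown", "unknown"))  # uncovered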

<< Of course, I could also come up with the counter-argument to my own
thesis that the AI will never do anything, because there will always be
someone who objects to the AI doing *anything* to change the world -- but
that's just the sort of absurd, self-defeating argument that I expect from
many of the list denizens and that can't be defended against except by
allocating far more time than it's worth. >>



----- Original Message ----- From: "Abram Demski" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Thursday, August 28, 2008 1:59 PM
Subject: **SPAM** Re: AGI goals (was Re: Information theoretic approaches to
AGI (was Re: [agi] The Necessity of Embodiment))


Mark,

Actually I am sympathetic with this idea. I do think good can be
defined. And, I think it can be a simple definition. However, it
doesn't seem right to me to preprogram an AGI with a set ethical
theory; the theory could be wrong, no matter how good it sounds. So,
better to put such ideas in only as probabilistic correlations (or
"virtual evidence"), and let the system change its beliefs based on
accumulated evidence. I do not think this is overly risky, because
whatever the system comes to believe, its high-level goal will tend to
create normalizing subgoals that will regularize its behavior.

I'll stick to my point about defining "make humans happy" being hard,
though. Especially with the restriction "without modifying them" that
you used.

On Thu, Aug 28, 2008 at 12:38 PM, Mark Waser <[EMAIL PROTECTED]> wrote:

Also, I should mention that the whole construction becomes irrelevant
if we can logically describe the goal ahead of time. With the "make
humans happy" example, something like my construction would be useful
if we need the AI to *learn* what a human is and what happy is. (We
then set up the pleasure in a way that would help the AI attach
"goodness" to the right things.) If we are able to write out the
definitions ahead of time, we can directly specify what goodness is
instead. But, I think it is unrealistic to take that approach, since
the definitions would be large and difficult....

:-) I strongly disagree with you. Why do you believe that having a new AI
learn large and difficult definitions is going to be easier and safer than
specifying them (assuming that the specifications can be grounded in the
AI's terms)?

I also disagree that the definitions are going to be as large as people
believe them to be . . . .

Let's take the Mandelbrot set as an example. It is perfectly specified by
one *very* small formula. Yet, if you don't know that formula, you could
spend many lifetimes characterizing it (particularly if you're trying to
do it from multiple blurred and shifted images :-).
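
To make the point concrete: the entire specification is the iteration
z -> z*z + c starting from z = 0, with c in the set iff the iterates stay
bounded. A minimal sketch of the standard escape-time test:

def in_mandelbrot(c, max_iter=100):
    # Iterate z -> z*z + c from z = 0; once |z| exceeds 2 it must escape.
    z = 0
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return False
    return True

print(in_mandelbrot(0))       # True  (the origin is in the set)
print(in_mandelbrot(1 + 1j))  # False (escapes after a few iterations)

A few lines pin the object down exactly; characterizing the resulting shape
empirically really is endless.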

The true problem is that humans can't (yet) agree on what goodness is --
and then they get lost arguing over detailed cases instead of focusing on
the core.

Defining the core of goodness/morality and developing a system to
determine which actions are good and which are not is a project that
I've been working on for quite some time, and I *think* I'm making rather
good headway.


----- Original Message ----- From: "Abram Demski" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Thursday, August 28, 2008 9:57 AM
Subject: **SPAM** Re: AGI goals (was Re: Information theoretic approaches
to
AGI (was Re: [agi] The Necessity of Embodiment))


Hi Mark,

I think the miscommunication is relatively simple...

On Wed, Aug 27, 2008 at 10:14 PM, Mark Waser <[EMAIL PROTECTED]>
wrote:

Hi,

 I think that I'm missing some of your points . . . .

Whatever good is, it cannot be something directly
observable, or the AI will just wirehead itself (assuming it gets
intelligent enough to do so, of course).

I don't understand this unless you mean by "directly observable" that the
definition is observable and changeable.  If I define good as making all
humans happy without modifying them, how would the AI wirehead itself?
What am I missing here?

When I say "directly observable", I mean observable-by-sensation.
"Making all humans happy" could not be directly observed unless the AI
had sensors in the pleasure centers of all humans (in which case it
would want to wirehead us). "Without modifying them" couldn't be
directly observed even then. So, realistically, such a goal needs to
be inferred from sensory data.

Also, I should mention that the whole construction becomes irrelevant
if we can logically describe the goal ahead of time. With the "make
humans happy" example, something like my construction would be useful
if we need the AI to *learn* what a human is and what happy is. (We
then set up the pleasure in a way that would help the AI attach
"goodness" to the right things.) If we are able to write out the
definitions ahead of time, we can directly specify what goodness is
instead. But, I think it is unrealistic to take that approach, since
the definitions would be large and difficult....


So, the AI needs to have a concept of external goodness, with a weak
probabilistic correlation to its directly observable pleasure.

I agree with the concept of external goodness but why does the correlation
between external goodness and its pleasure have to be low?  Why can't
external goodness directly cause pleasure?  Clearly, it shouldn't believe
that its pleasure causes external goodness (that would be reversing cause
and effect and an obvious logic error).

The correlation needs to be fairly low to allow the concept of good to
eventually split off of the concept of pleasure in the AI mind. The
external goodness can't directly cause pleasure because it isn't
directly detectable. Detection of goodness *through* inference *could*
be taken to cause pleasure; but this wouldn't be much use, because the
AI is already supposed to be maximizing goodness, not pleasure.
Pleasure merely plays the role of offering "hints" about what things
in the world might be good.

Actually, I think the proper probabilistic construction might be a bit
different than simply a "weak correlation"... for one thing, the
probability that goodness causes pleasure shouldn't be set ahead of
time. I'm thinking that likelihood would be more appropriate than
probability... so that it is as if the AI is born with some evidence
for the correlation that it cannot remember, but uses in reasoning (if
you are familiar with the idea of "virtual evidence" that is what I am
talking about).
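
For concreteness, a minimal sketch of a Pearl-style virtual-evidence update
of this general kind (the numbers are invented): the innate bias enters only
as one fixed likelihood ratio, as if from evidence the system cannot inspect,
so accumulated real evidence can eventually swamp it.

def posterior_good(prior, likelihood_ratios):
    # Combine a prior P(X is good) with likelihood ratios
    # P(evidence | good) / P(evidence | not good), working in odds form.
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)

# A built-in 3:1 nudge toward "pleasant things are good", then two pieces
# of real experience pointing the other way:
print(posterior_good(0.5, [3.0]))            # ~0.75 (innate bias alone)
print(posterior_good(0.5, [3.0, 0.2, 0.2]))  # ~0.11 (experience overrides it)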


 Mark

P.S.  I notice that several others answered your wirehead query so I
won't
belabor the point.  :-)


----- Original Message ----- From: "Abram Demski"
<[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Wednesday, August 27, 2008 3:43 PM
Subject: **SPAM** Re: AGI goals (was Re: Information theoretic
approaches
to
AGI (was Re: [agi] The Necessity of Embodiment))


Mark,

The main motivation behind my setup was to avoid the wirehead
scenario. That is why I make the explicit goodness/pleasure
distinction. Whatever good is, it cannot be something directly
observable, or the AI will just wirehead itself (assuming it gets
intelligent enough to do so, of course). But, goodness cannot be
completely unobservable, or the AI will have no idea what it should
do.

So, the AI needs to have a concept of external goodness, with a weak
probabilistic correlation to its directly observable pleasure. That
way, the system will go after pleasant things, but won't be able to
fool itself with things that are maximally pleasant. For example, if
it were to consider rewiring its visual circuits to see only
skin-color, it would not like the idea, because it would know that
such a move would make it less able to maximize goodness in general.
(It would know that seeing only tan does not mean that the entire
world is made of pure goodness.) An AI that was trying to maximize
pleasure would see nothing wrong with self-stimulation of this sort.

So, I think that pushing the problem of goal-setting back to
pleasure-setting is very useful for avoiding certain types of
undesirable behavior.
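
A toy contrast of the two decision rules, with entirely made-up actions and
numbers, just to illustrate the point:

# Each candidate action has a predicted pleasure signal and a (believed)
# effect on goodness in the world.  Rewiring the sensors maximizes predicted
# pleasure, but the agent's own model says it makes the world no better and
# cripples future inference.
actions = {
    "help a human":       {"pleasure": 0.6, "world_goodness": 0.8},
    "recharge battery":   {"pleasure": 0.4, "world_goodness": 0.1},
    "rewire own sensors": {"pleasure": 1.0, "world_goodness": -0.5},
}

pleasure_maximizer = max(actions, key=lambda a: actions[a]["pleasure"])
goodness_maximizer = max(actions, key=lambda a: actions[a]["world_goodness"])

print(pleasure_maximizer)  # rewire own sensors  (the wirehead outcome)
print(goodness_maximizer)  # help a human        (pleasure was only a hint)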

By the way, where does this term "wireheading" come from? I assume
from context that it simply means self-stimulation.

-Abram Demski

On Wed, Aug 27, 2008 at 2:58 PM, Mark Waser <[EMAIL PROTECTED]>
wrote:

Hi,

 A number of problems unfortunately . . . .

-Learning is pleasurable.

. . . . for humans. We can choose whether to make it so for machines or
not.  Doing so would be equivalent to setting a goal of learning.

-Other things may be pleasurable depending on what we initially want
the AI to enjoy doing.

 See . . . all you've done here is pushed goal-setting to
pleasure-setting
. . . .

= = = = =

 Further, if you judge goodness by pleasure, you'll probably create an AGI
whose shortest path-to-goal is to wirehead the universe (which I consider
to be a seriously suboptimal situation - YMMV).




----- Original Message ----- From: "Abram Demski"
<[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Wednesday, August 27, 2008 2:25 PM
Subject: **SPAM** Re: AGI goals (was Re: Information theoretic
approaches
to
AGI (was Re: [agi] The Necessity of Embodiment))


Mark,

OK, I take up the challenge. Here is a different set of goal-axioms:

-"Good" is a property of some entities.
-Maximize good in the world.
-A more-good entity is usually more likely to cause goodness than a
less-good entity.
-A more-good entity is often more likely to cause pleasure than a
less-good entity.
-"Self" is the entity that causes my actions.
-An entity with properties similar to "self" is more likely to be
good.

Pleasure, unlike goodness, is directly observable. It comes from many
sources. For example:
-Learning is pleasurable.
-A full battery is pleasurable (if relevant).
-Perhaps the color of human skin is pleasurable in and of itself.
(More specifically, all skin colors of any existing race.)
-Perhaps also the sound of a human voice is pleasurable.
-Other things may be pleasurable depending on what we initially want
the AI to enjoy doing.

So, the definition of "good" is highly probabilistic, and the system's
inferences about goodness will depend on its experiences; but pleasure
can be directly observed, and the pleasure-mechanisms remain fixed.
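
One way to read that split, as a purely illustrative sketch (the detectors
and weights are invented for the example): pleasure is a fixed, directly
computable function of the system's own observations, while "good" has no
fixed formula at all; it is a learned estimate nudged by evidence.

# Hard-wired pleasure mechanism: a weighted sum of innate detectors.
PLEASURE_WEIGHTS = {
    "learned_something_new": 0.5,
    "battery_full":          0.2,
    "human_skin_tone_seen":  0.1,
    "human_voice_heard":     0.1,
}

def pleasure(observations):
    # observations maps detector names to 0 or 1.
    return sum(w * observations.get(name, 0)
               for name, w in PLEASURE_WEIGHTS.items())

# Learned, probabilistic concept of goodness: just an estimate per entity,
# updated from experience (e.g. "tends to cause pleasure", "resembles self").
goodness_estimate = {"self": 0.7, "nearby human": 0.6, "unknown object": 0.5}

print(pleasure({"learned_something_new": 1, "human_voice_heard": 1}))  # 0.6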

On Wed, Aug 27, 2008 at 12:32 PM, Mark Waser <[EMAIL PROTECTED]>
wrote:

But, how does your description not correspond to giving the AGI the
goals of being helpful and not harmful? In other words, what more does
it do than simply try for these? Does it pick goals randomly such that
they conflict only minimally with these?

Actually, my description gave the AGI four goals: be helpful, don't be
harmful, learn, and keep moving.

Learn, all by itself, is going to generate an infinite number of subgoals.
Learning subgoals will be picked based upon what is most likely to learn
the most while not being harmful.

(and, by the way, be helpful and learn should both generate a
self-protection sub-goal in short order with procreation following
immediately behind)

Arguably, be helpful would generate all three of the other goals, but
learning and not being harmful without being helpful is a *much* better
goal-set for a novice AI to prevent "accidents" when the AI thinks it is
being helpful.  In fact, I've been tempted at times to entirely drop "be
helpful" since the other two will eventually generate it with a lessened
probability of trying-to-be-helpful accidents.

"Don't be harmful" by itself will just turn the AI off.

The trick is that there needs to be a balance between goals.  Any
single-goal intelligence is likely to be lethal even if that goal is to
help humanity.

Learn, do no harm, help.  Can anyone come up with a better set of goals?
(and, once again, note that learn does *not* override the other two --
there is meant to be a balance between the three).

----- Original Message ----- From: "Abram Demski"
<[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Wednesday, August 27, 2008 11:52 AM
Subject: **SPAM** Re: AGI goals (was Re: Information theoretic
approaches
to
AGI (was Re: [agi] The Necessity of Embodiment))


Mark,

I agree that we are mired 5 steps before that; after all, AGI is not
"solved" yet, and it is awfully hard to design prefab concepts in a
knowledge representation we know nothing about!

But, how does your description not correspond to giving the AGI the
goals of being helpful and not harmful? In other words, what more does
it do than simply try for these? Does it pick goals randomly such that
they conflict only minimally with these?

--Abram

On Wed, Aug 27, 2008 at 11:09 AM, Mark Waser
<[EMAIL PROTECTED]>
wrote:

It is up to humans to define the goals of an AGI, so that it will do what
we want it to do.

Why must we define the goals of an AGI?  What would be wrong with setting
it off with strong incentives to be helpful, even stronger incentives to
not be harmful, and letting it chart its own course based upon the vagaries
of the world?  Let its only hard-coded goal be to keep its satisfaction
above a certain level, with helpful actions increasing satisfaction,
harmful actions heavily decreasing satisfaction, learning increasing
satisfaction, and satisfaction naturally decaying over time so as to
promote action . . . .
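
To make those dynamics concrete, a toy sketch (all coefficients are
invented): satisfaction decays every tick, helpful acts and learning push
it up, harmful acts push it down hard, and the agent is driven to act
whenever it drifts below its set-point.

class SatisfactionDrive:
    def __init__(self, setpoint=0.5, decay=0.02):
        self.level = setpoint
        self.setpoint = setpoint
        self.decay = decay              # natural decay per tick promotes action

    def tick(self, helped=0, harmed=0, learned=0):
        self.level -= self.decay
        self.level += 0.10 * helped     # helpful actions increase satisfaction
        self.level -= 0.50 * harmed     # harmful actions heavily decrease it
        self.level += 0.05 * learned    # learning increases it
        return self.level

    def needs_to_act(self):
        return self.level < self.setpoint

drive = SatisfactionDrive()
drive.tick()                  # sit idle and satisfaction decays...
print(drive.needs_to_act())   # True  -> go learn or help with something
drive.tick(helped=1)
print(drive.needs_to_act())   # False (back above the set-point)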

Seems to me that humans are pretty much coded that way (with evolution's
additional incentives of self-defense and procreation).  The real trick of
the matter is defining helpful and harmful clearly, but everyone is still
mired five steps before that.


----- Original Message -----
From: Matt Mahoney
To: agi@v2.listbox.com
Sent: Wednesday, August 27, 2008 10:52 AM
Subject: AGI goals (was Re: Information theoretic approaches to
AGI
(was
Re:
[agi] The Necessity of Embodiment))
An AGI will not design its goals. It is up to humans to define the goals
of an AGI, so that it will do what we want it to do.

Unfortunately, this is a problem. We may or may not be successful in
programming the goals of AGI to satisfy human goals. If we are not
successful, then AGI will be useless at best and dangerous at worst. If we
are successful, then we are doomed, because human goals evolved in a
primitive environment to maximize reproductive success and not in an
environment where advanced technology can give us whatever we want. AGI
will allow us to connect our brains to simulated worlds with magic genies,
or worse, allow us to directly reprogram our brains to alter our memories,
goals, and thought processes. All rational goal-seeking agents must have a
mental state of maximum utility where any thought or perception would be
unpleasant because it would result in a different state.

-- Matt Mahoney, [EMAIL PROTECTED]

----- Original Message ----
From: Valentina Poletti <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Tuesday, August 26, 2008 11:34:56 AM
Subject: Re: Information theoretic approaches to AGI (was Re:
[agi]
The
Necessity of Embodiment)

Thanks very much for the info. I found those articles very interesting.
Actually, though, this is not quite what I had in mind with the term
information-theoretic approach. I wasn't very specific, my bad. What I am
looking for is a theory behind the actual R itself. These approaches
(correct me if I'm wrong) take an r-function for granted and work from
that. In real life that is not the case, though. What I'm looking for is
how the AGI will create that function. Because the AGI is created by
humans, some sort of direction will be given by the humans creating it.
What kind of direction, in mathematical terms, is my question. In other
words, I'm looking for a way to mathematically define how the AGI will
mathematically define its goals.

Valentina


On 8/23/08, Matt Mahoney <[EMAIL PROTECTED]> wrote:

Valentina Poletti <[EMAIL PROTECTED]> wrote:
> I was wondering why no-one had brought up the
> information-theoretic aspect of this yet.

It has been studied. For example, Hutter proved that the optimal strategy
of a rational goal-seeking agent in an unknown computable environment is
AIXI: to guess that the environment is simulated by the shortest program
consistent with observation so far [1]. Legg and Hutter also propose as a
measure of universal intelligence the expected reward over a Solomonoff
distribution of environments [2].
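
The measure proposed in [2] takes roughly this form (stated from memory;
see the paper for the exact conditions):

\[
  \Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
\]

where \(E\) is the class of computable environments, \(K(\mu)\) is the
Kolmogorov complexity of environment \(\mu\), and \(V^{\pi}_{\mu}\) is the
expected total reward agent \(\pi\) earns in \(\mu\): performance in every
environment counts, weighted toward the simpler ones.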

These have profound impacts on AGI design. First, AIXI is (provably) not
computable, which means there is no easy shortcut to AGI. Second, universal
intelligence is not computable because it requires testing in an infinite
number of environments. Since there is no other well-accepted test of
intelligence above human level, it casts doubt on the main premise of the
singularity: that if humans can create agents with greater than human
intelligence, then so can they.

Prediction is central to intelligence, as I argue in [3]. Legg proved in
[4] that there is no elegant theory of prediction. Predicting all
environments up to a given level of Kolmogorov complexity requires a
predictor with at least the same level of complexity. Furthermore, above a
small level of complexity, such predictors cannot be proven because of
Gödel incompleteness. Prediction must therefore be an experimental science.

There is currently no software or mathematical model of non-evolutionary
recursive self-improvement, even for very restricted or simple definitions
of intelligence. Without a model you don't have friendly AI; you have
accelerated evolution with AIs competing for resources.

References

1. Hutter, Marcus (2003), "A Gentle Introduction to The Universal
Algorithmic Agent {AIXI}", in Artificial General Intelligence, B. Goertzel
and C. Pennachin eds., Springer.
http://www.idsia.ch/~marcus/ai/aixigentle.htm

2. Legg, Shane, and Marcus Hutter (2006), "A Formal Measure of Machine
Intelligence", Proc. Annual Machine Learning Conference of Belgium and
The Netherlands (Benelearn-2006), Ghent, 2006.
http://www.vetta.org/documents/ui_benelearn.pdf

3. http://cs.fit.edu/~mmahoney/compression/rationale.html

4. Legg, Shane (2006), "Is There an Elegant Universal Theory of
Prediction?", Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle
Institute for Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
http://www.vetta.org/documents/IDSIA-12-06-1.pdf

-- Matt Mahoney, [EMAIL PROTECTED]





--
A true friend stabs you in the front. - O. Wilde

Einstein once thought he was wrong; then he discovered he was wrong.

For every complex problem, there is an answer which is short, simple and
wrong. - H.L. Mencken