Brian Atkins wrote:
I'd like to do a small data-gathering project regarding producing a
Might-Be-Friendly AI (MBFAI). In other words, for whatever reason (I
don't want to go into it again in this thread), we assume that 100%
provability is out of the question for now, so we take one step back,
and the decision then is either to produce something with a
less-than-100% chance of success, or to hold off and make nothing until
we can do better.
So two obvious questions arise. (1) What lower-than-100% likelihood of
success is acceptable, at a very minimum? (2) How do we concretely
derive that percentage before proceeding to launch?
I'd like to gather candidate answers to the first question, in order to
help the researchers focus on roughly what minimal level of performance
the community at large feels they ought to be aiming for.
If you'd like to participate, please email me _offlist_ at
[EMAIL PROTECTED] with your _lowest acceptable_ likelihood of success
percentage, where the percentage represents:
Your lowest acceptable chance that the MBFAI will avoid running amok in
one 24-hour day.
So if you choose, for example, 99%, that would mean you want there to
be at most a 1% chance that the MBFAI would run amok during each day it
exists. "Run amok" is left undefined, but is understood to mean,
basically, that it goes off and does something "bad" other than what we
were hoping it might do. If you like, include a short explanation of
how you came up with your number, and let me know if you prefer that I
not release your name along with the answers.
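(As an aside, here is a minimal Python sketch of how that per-day
figure compounds over longer periods, under the simplifying assumption
that each day is an independent trial; the function name and numbers
are illustrative only, not part of the proposal above.)

# Illustrative sketch only: compound a per-day "no run amok" probability
# over N days, treating each day as an independent trial (an assumption,
# not something stated in the proposal).

def survival_probability(p_per_day, days):
    """Chance of no run-amok event across `days` days, given independence."""
    return p_per_day ** days

if __name__ == "__main__":
    p = 0.99  # the 99%-per-day example above
    for days in (1, 30, 365):
        print(f"{days:>4} days: "
              f"{survival_probability(p, days):.1%} chance of no incident")

With a 99%-per-day figure this comes to roughly a 74% chance of getting
through a month, and only about a 2.6% chance of getting through a full
year, without incident - worth keeping in mind when picking a per-day
number.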
I'll post the average, high/low range, etc. later.
Brian,
As you might have gathered from the ruckus on SL4 recently, one of the
things I have been doing in the last couple of weeks is writing a paper
on the Friendliness Problem. As soon as this is ready I will publish it
on my site, and announce it.
This is very much relevant to your question. In the absence of the
finished version of that paper, I would like to point out that it may
well be a matter of designing an AGI in such a way that it can be
almost certain not to go bad, even though the reasoning behind this
"almost certain" assessment is not a matter of mathematical proof.
By way of analogy, consider the problem of pigs spontaneously growing
wings and flying. We cannot strictly prove that this will never happen,
nor can we easily put a number on just how unlikely it is. But we can
know, from the genetic makeup of pigs and from every pig observed in
the past, that the probability of some genetic freak pig being born
with hidden wings under its skin, which it suddenly releases one day
before taking off, is so low that we would be insane to worry about it.
Now, my argument is going to be that if we choose a particular sort of
design for an AGI (and avoid certain dangerous designs), we can know
from the structure of it that it will have no tendency to depart from
friendliness. Further, we can know that, because its motives start out
with friendliness as an important goal, it will take careful steps to
ensure that no dangerous designs are tried out in the future: it will
neutralize all other AGIs that are "badly" designed, and then ensure
that anything else created after that point uses the same safe design.
In this situation, the AGI (and all subsequent AGIs created by our
civilization) would have about the same probability of recidivism as,
say, the probability that a pig will go airborne. We won't be able to
guarantee anything, but it will clearly not be a source of worry.
More later when I get the paper finished.
Richard Loosemore.