Brian Atkins wrote:
I'd like to do a small data gathering project regarding producing a Might-Be-Friendly AI (MBFAI). In other words, for whatever reason (I don't want to go into it again in this thread), we assume 100% provability is out of the question for now, so we take one step back, and the decision becomes either to produce something with a less-than-100% chance of success, or to hold off and not make anything until we can do better.

So two obvious questions arise. (1) What lower-than-100% likelihood of success is acceptable, at a very minimum? (2) How do we concretely derive that percentage before proceeding to launch?

I'd like to gather ideas of answers to the first question, in order to help researchers get a rough sense of the minimal level of performance the community at large feels they ought to be aiming for.

If you'd like to participate, please email me _offlist_ at [EMAIL PROTECTED] with your _lowest acceptable_ likelihood of success percentage, where the percentage represents:

   Your lowest acceptable chance that the MBFAI will avoid running amok in
   one 24 hour day.

So if you choose, for example, 99%, that would mean you want there to be at most a 1% chance that the MBFAI runs amok during each day it exists. "Run amok" is left undefined, but is understood to mean, roughly, that it goes off and does something "bad" other than what we were hoping it might do. If you like, include a short explanation of how you came up with your number, and say if you prefer that I not release your name along with the answers.
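For context, here is a quick hypothetical sketch (not part of the survey itself) of how a per-day figure compounds over longer horizons. It simply reuses the 99% example above and assumes each day's risk is independent and constant:

    # Hypothetical illustration only: how a fixed per-day chance of
    # avoiding a runaway compounds over time, assuming each day's risk
    # is independent and constant.
    daily_ok = 0.99  # 99% chance of no runaway on any single day
    for days in (1, 30, 365, 3650):
        p_ok_so_far = daily_ok ** days
        print(f"{days:>5} days: {p_ok_so_far:.4f} chance of no runaway yet")

Under those assumptions, a 99%-per-day figure leaves only about a 2.5% chance of getting through a full year with no incident, which is worth keeping in mind when picking your minimum.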

I'll post the average, high/low range, etc. later.

Brian,

As you might have gathered from the ruckus on SL4 recently, one of the things I have been doing in the last couple of weeks is writing a paper on the Friendliness Problem. As soon as this is ready I will publish it on my site, and announce it.

This is very much relevant to your question, but in the absence of the finished version of that paper, I would like to point out that it may well be possible to design an AGI in such a way that it is almost certain not to go bad, even though the reasoning behind this "almost certain" assessment is not a matter of mathematical proof.

By way of analogy, consider the problem of pigs spontaneously growing wings and flying. We cannot strictly prove that this is guaranteed never to happen, nor can we easily calculate a probability for how unlikely it is. But we can know, from the genetic makeup of pigs and from every pig observed in the past, that the probability of some genetic freak pig being born with hidden wings under its skin, which suddenly unfold one day and carry it aloft, is so low that we would be insane to worry about it.

Now, my argument is going to be that if we choose a particular sort of design for an AGI (and avoid certain dangerous designs), we can know from its structure that it will have no tendency to depart from friendliness. Further, because its motives start out with friendliness as an important goal, we can know that it will take careful steps to ensure that no dangerous designs are tried out in the future: it will neutralize any other AGIs that are "badly" designed, and then ensure that anything it creates afterwards uses the same safe design.

In this situation, the AGI (and all subsequent AGIs created by our civilization) would have roughly the same probability of recidivism as, say, the probability that a pig will go airborne. We won't be able to guarantee anything, but it will clearly not be a source of worry.

More later when I get the paper finished.

Richard Loosemore.
