> Your intuitions say... I am trying to summarize my impression of your
> viewpoint, please feel free to correct me... "AI morality is a matter
> of experiential learning, not just for the AI, but for the programmers.
> To teach an AI morality you must give it the right feedback on moral
> questions and reinforce the right behaviors... and you must also learn
> *about* the deep issues of AI morality by raising a young AI. It isn't
> pragmatically realistic to work out elaborate theories of AI morality
> in advance; you must learn what you need to know as you go along.
> Moreover, learning what you need to know, as you go along, is a good
> strategy for creating a superintelligence... or at least, the rational
> estimate of the goodness of that strategy is sufficient to make it a
> good idea to try and create a superintelligence, and there aren't any
> realistic strategies that are better. An informal, intuitive theory of
> AI morality is good enough to spark experiential learning in the
> *programmer* that carries you all the way to the finish line. You'll
> learn what you need to know as you go along. The most fundamental
> theoretical and design challenge is making AI happen, at all; that's
> the really difficult part that's defeated everyone else so far. Focus
> on making AI happen. If you can make AI happen, you'll learn how to
> create moral AI from the experience."
Hmmm. This is almost a good summary of my perspective, but you've still
not come to grips with the extent of my uncertainty ;) I am not at all
SURE that "An informal, intuitive theory of AI morality is good enough
to spark experiential learning in the *programmer* that carries you all
the way to the finish line," where by the "finish line" you mean an AGI
whose ongoing evolution will lead to beneficial effects for both humans
and AGIs.

I'm open to the possibility that it may someday become clear, as AGI
work progresses, that a systematic theory of AGI morality is necessary
in order to proceed safely. But I suspect that, in order for me to feel
such a theory was necessary, I'd have to understand considerably more
about AGI than I do right now. And I suspect that the only way I'm
going to come to understand considerably more about AGI is through
experimentation with AGI systems.

(This is where my views differ from Shane's; he is more bullish on the
possibility of learning a lot about AGI through mathematical theory. I
think this will happen, but I think the math theory will only get
really useful when it is evolving in unison with practical AGI work.)

Right now, it is not clear to me that a systematic theory of AGI
morality is necessary in order to proceed safely. And it is also not
clear to me that such a theory is even possible to formulate based on
our current state of knowledge about AGI.

> In contrast, I felt that it was a good idea to develop a theory of AI
> morality in advance, and have developed this theory to the point
> where it currently predicts, counter to my initial intuitions and to
> my considerable dismay:
>
> 1) AI morality is an extremely deep and nonobvious challenge which
> has no significant probability of going right by accident.

I agree it's a deep and nonobvious challenge. You've done a great job
of demonstrating that.
I don't agree that any of your published writings have shown that it
"has no significant probability of going right by accident."

> 2) If you get the deep theory wrong, there is a strong possibility of
> a silent catastrophic failure: the AI appears to be learning
> everything just fine, and both you and the AI are apparently making
> all kinds of fascinating discoveries about AI morality, and
> everything seems to be going pretty much like your intuitions predict
> above, but when the AI crosses the cognitive threshold of
> superintelligence it takes actions which wipe out the human species
> as a side effect.

Clearly this could happen, but I haven't read anything in your writings
leading to even a heuristic, intuitive probability estimate for this
outcome.

> AIXI, which is a completely defined formal system, definitely
> undergoes a failure of exactly this type.
>
> Ben, you need to be able to spot this. Think of it as a practice run
> for building a real transhuman AI. If you can't spot the critical
> structural property of AIXI's foundations that causes AIXI to undergo
> silent catastrophic failure, then a real-world reprise of that
> situation with Novamente would mean you don't have the deep theory to
> choose good foundations deliberately, you can't spot bad foundations
> deductively, and because the problems only show up when the AI
> reaches superintelligence, you won't get experiential feedback on the
> failure of your theory until it's too late. Exploratory research on
> AI morality doesn't work for AIXI - it doesn't even visibly fail. It
> *appears* to work until it's too late. If you don't spot the problem
> in advance, you lose.
>
> If I can demonstrate that your current strategy for AI development
> would undergo silent catastrophic failure in AIXI - that your stated
> strategy, practiced on AIXI, would wipe out the human species, and
> you didn't spot it - will you acknowledge that as a "practice loss"?
> A practice loss isn't the end of the world.
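(As an aside for list readers who haven't seen Hutter's papers: my rough
understanding of the AIXI definition, which you should check against
Hutter's own writeup rather than trusting my recollection, is that at
each step it selects the action maximizing expected total future reward
under a simplicity-weighted mixture over all computable environments,
something like:

```latex
a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
  \bigl( r_k + \cdots + r_m \bigr)
  \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

where U is a universal Turing machine, the programs q are candidate
environment models, and \ell(q) is program length. So the formalism is
fully specified: Solomonoff induction plus expectimax search, with the
reward channel simply taken as primitive.)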
> I have one practice loss on my record too. But when that happened I
> took it seriously; I changed my behavior as a result. If you can't
> spot the silent failure in AIXI, would you then *please* admit that
> your current strategy on AI morality development is not adequate for
> building a transhuman AI? You don't have to halt work on Novamente,
> just accept that you're not ready to try and create a transhuman AI
> *yet*.
>
> I can spot the problem in AIXI because I have practice looking for
> silent failures, because I have an underlying theory that makes it
> immediately obvious which useful properties are formally missing from
> AIXI, and because I have a specific fleshed-out idea for how to
> create moral systems and I can see AIXI doesn't work that way. Is it
> really all that implausible that you'd need to reach that point
> before being able to create a transhuman Novamente? Is it really so
> implausible that AI morality is difficult enough to require at least
> one completely dedicated specialist?

Eliezer, I have not thought very hard about AIXI/AIXItl and its
implications. For better or for worse, I am typing these e-mails at
about 90 words a minute, in between doing other things that are of
higher short-term priority ;) What I have or have not "spotted" about
AIXI/AIXItl doesn't mean very much to me.

I don't have time this week to sit back for a few hours and think hard
about the possible consequences of AIXI/AIXItl as a real AI system. For
example, I have to leave the house in half an hour for a meeting
related to some possible Novamente funding; then when I get home I have
a paper on genetic regulatory network inference using Novamente to
finish, etc. etc. I wish I had all day to think about theoretical AGI,
but that's not the case. Hopefully, if the practical stuff I'm doing
now succeeds, in a couple of years when Novamente is further along I
WILL be in a position where I can focus 80% instead of 30% of my time
on the pure AGI aspects.
What I do or do not spot about AIXI or any other system in a few spare
moments doesn't say much about what I and the whole Novamente team
would or would not spot about Novamente, which we understand far more
deeply than I understand AIXI, and on which we are focusing a lot of
our time.

-- Ben G