Eliezer S. Yudkowsky wrote:
> >> . . .
> > Yes. Laws (logical constraints) are inevitably ambiguous.
>
> Does that include the logical constraints governing the reinforcement
> process itself?
There is a logic of the reinforcement process, but it is a behavior rather
than a constraint on a behavior. By ambiguity, I mean, for example, the
ambiguity of Asimov's First Law: "A robot may not injure a human being, or,
through inaction, allow a human being to come to harm." This is ambiguous in
the situation where one human is harming another: the law requires the robot
to intervene for the victim but prohibits it from intervening against the
attacker.

> >> Like, the complexity of everything the SI needs to do is some very high
> >> quantity, while the complexity of the principles that are supposed to
> >> entail it is small, right?
> >
> > As wonderfully demonstrated by Eric Baum's papers, complex
> > behaviors are learned via simple values.
>
> *Some* complex behaviors can be learned via *some* simple values. The
> question is understanding *which* simple values result in the learning of
> which complex behaviors; for example, Eric Baum's system had to be created
> with simple values that behave in a very precise way in order to achieve
> its current level of learning ability. That's why Eric Baum had to write
> the paper, instead of just saying "Aha, I can produce complex behaviors
> via simple values."

Baum's algorithm is very carefully worked out, but the reinforcement values
it learns from are simple. And a successful reinforcement learning algorithm
is one that can work from any reinforcement values in any situation.

> . . .
> Yes, reinforcement learning generates very complex behaviors from there.
> The question is *which* complex behaviors - whether you see all the
> complex behaviors you want to see, and none of the complex behaviors you
> don't want to see.
>
> Can I take it from the above that you believe that AI morality can be
> created by reinforcing behaviors using a predicate P that acts on incoming
> video sensory information to recognize smiles and laughter and generate a
> reward signal? Is this adequate for a superintelligence too?
The key to intelligence is a good reinforcement learning algorithm: one
that can work from any reinforcement values and efficiently learn behaviors
that maximize those values. So the values can be simple, like recognizing
smiles and laughter, while the learned behaviors can be complex, even to
the point of making billions of humans happy. That qualifies as
super-intelligent.

Despite the wonderful work of Eric Baum and others, developing really
robust reinforcement learning is a very hard challenge, which is why my
estimate for the arrival of SI is 2100 rather than 2010 or 2020. I hope I'm
wrong, because I want to meet an SI.

Bill
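The point that a generic learning algorithm works from arbitrary
reinforcement values can be sketched with a toy tabular Q-learner. This is
only a minimal illustration of the separation between learner and values
(it is not Baum's Hayek system, and the tiny world, the `step` function,
and the `reward` predicate are invented for the example): the learner is
the same no matter what values you plug in; only the `reward` function
encodes what it comes to maximize.

```python
import random

random.seed(0)  # deterministic for the example

def q_learn(n_states, actions, step, reward, episodes=2000,
            alpha=0.5, gamma=0.9, epsilon=0.1, horizon=20):
    """Generic tabular Q-learning. The algorithm knows nothing about
    the task; `reward` alone supplies the reinforcement values."""
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        s = random.randrange(n_states)  # random start so reward is found
        for _ in range(horizon):
            # epsilon-greedy action selection
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda b: q[(s, b)])
            s2 = step(s, a)
            r = reward(s2)
            # standard Q-learning update toward r + gamma * max_a' Q(s', a')
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in actions)
                                  - q[(s, a)])
            s = s2
    return q

# A 1-D world of 10 states; the "simple value" is: state 9 is smiling.
step = lambda s, a: max(0, min(9, s + a))
reward = lambda s: 1.0 if s == 9 else 0.0

q = q_learn(10, [-1, +1], step, reward)
policy = [max([-1, +1], key=lambda b: q[(s, b)]) for s in range(10)]
print(policy)
```

Swapping in a different `reward` function, with no change to `q_learn`,
yields a different learned behavior; that is the sense in which the values
are simple while the behaviors they induce can be arbitrarily complex.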
