On Sun, Jan 27, 2013 at 4:43 PM, Piaget Modeler <[email protected]> wrote: The Reinforcement Learning (RL) you discuss is called external reinforcement. What happens when you move to intrinsic (internal) reinforcement, where the reward function arises from a robot solving its own problems, forming its own world model, and existing and participating in the world?
------
It is amazing that a fundamental insight like this is still treated as if it were controversial. Part of the problem is that when people recognize they have to explain how superficial analysis of digitized input data is turned into insight, and they need a soulless answer, they are attracted to the most mundane method, the one that strips out most presumptions - like internal reasoning. Another reason RL attracts computer guys is that it looks binary.
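To make the external/intrinsic distinction concrete, here is a minimal sketch, assuming a toy agent that maintains its own predictive world model. Every name and number below is illustrative only and does not refer to any existing system:

import numpy as np

def external_reward(observation, goal):
    # Extrinsic reward: defined from outside the agent, against a designer-chosen goal.
    return 1.0 if np.array_equal(observation, goal) else 0.0

class WorldModel:
    # A trivial transition-count model the agent builds for itself from experience.
    def __init__(self, n_states, n_actions):
        self.counts = np.zeros((n_states, n_actions, n_states))

    def surprise(self, s, a, s_next):
        # How unexpected the observed outcome was under the agent's own model.
        total = self.counts[s, a].sum()
        return 1.0 - (self.counts[s, a, s_next] / total if total else 0.0)

    def update(self, s, a, s_next):
        self.counts[s, a, s_next] += 1.0

def intrinsic_reward(model, s, a, s_next):
    # Intrinsic reward: generated inside the agent from its own prediction error
    # (a curiosity-style signal), not handed down by an external designer.
    r = model.surprise(s, a, s_next)
    model.update(s, a, s_next)
    return r

model = WorldModel(n_states=5, n_actions=2)
print(intrinsic_reward(model, s=0, a=1, s_next=3))  # 1.0: a fresh model is surprised by everything

The only difference that matters here is where the scalar comes from: the first function is fixed from outside, while the second is computed from the agent's own, evolving model of the world.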
But when a computer is able to solve its own problems, the effect of those solutions can act as reinforcement. And beyond that, since complex reasoning is feasible, it does not have to be a case of simple reinforcement, but of complex encouragement that comes from conditional projections of knowledge. So even though we cannot actually create an AGI program right now, we are encouraged by our insights and by the very limited results we can get now. Those "reinforcements" are not coming from external events. In the worst case this internal motivation can be delusional, and that is one reason why external reinforcements can influence us so much.

Jim Bromer

On Sun, Jan 27, 2013 at 4:43 PM, Piaget Modeler <[email protected]> wrote:

> You're avoiding the question, Aaron. The question I raised is an ethical one, and you're answering a technical one.
>
> You answered, "How can we prevent robots from desiring things like freedom or leisure or compensation?"
>
> I asked, "What do we give robots when they ask for rights?" I mean, even animals have rights (PETA). Why shouldn't robots?
>
> The Reinforcement Learning (RL) you discuss is called external reinforcement. What happens when you move to intrinsic (internal) reinforcement, where the reward function arises from a robot solving its own problems, forming its own world model, and existing and participating in the world? When the model consists of millions or billions of individual schemes (entities), how are you going to do surgery to extract those entities dealing with liberty, justice, or fairness? And why would you want to?
>
> The real question is: do you join PETR (People for the Ethical Treatment of Robots) or not? Do you embrace robot slavery or not? And is some form of slavery the solution to the global economy?
>
> ~PM
>
> ------------------------------
> Date: Sun, 27 Jan 2013 13:01:39 -0600
> Subject: Re: [agi] Robots and Slavery
> From: [email protected]
> To: [email protected]
>
> What if you didn't program robots to desire freedom or leisure, but instead they became sentient and decided on their own that they want freedom, leisure, monetary compensation, and rights?
>
> In the field of Reinforcement Learning, which studies how to implement "wants" in software, there is a basic separation of every algorithm into two pieces: the part that does the learning and choosing (the agent), and the part that measures how well things are going (the reward function). The agent is the dynamic/intelligent part, and the reward function is a static function to be optimized. You can completely replace the reward function with a different one, and if the agent is well designed, it will learn a completely different set of behaviors to optimize the new reward function within the exact same environment. (http://en.wikipedia.org/wiki/Reinforcement_learning)
>
> In our own brains, we have specialized areas that respond to certain types of stimuli and generate reward signals which are distributed throughout the brain. It is even possible to reshape a person's or animal's reward function, using an external signal to override or add to our natural wants. (http://en.wikipedia.org/wiki/Brain_stimulation_reward)
>
> Intelligence is completely separable from desire. Both the system we intend to reverse engineer and the theory about how such systems work agree on this point.
>
> If our robots were to decide they wanted freedom, leisure, monetary compensation, rights, or anything else we can think of, it would be because the reward function we gave them included some sort of incentive to seek those out. In other words, even if we didn't directly program them to want those things, we necessarily did so indirectly in the process of shaping the reward function. In either case, provided the structure of our programs reflects the theory and keeps these components separated (which does not mean they can't interact or depend on each other's behavior, but rather that we bothered to keep our design appropriately modular), we can redesign and replace the reward function so that the robots no longer desire things we don't want them to desire.
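To make the separation Aaron describes concrete, here is a minimal sketch: a tabular Q-learning agent that never sees how the reward is computed, only the scalar it returns, plus a toy environment and two interchangeable reward functions. All names are illustrative and do not refer to any particular library:

import random
from collections import defaultdict

class ChainEnv:
    # Toy environment: walk left/right along a short corridor; the right end is "goal".
    def __init__(self, length=4):
        self.length = length

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action is -1 or +1
        self.pos = max(0, min(self.length, self.pos + action))
        done = self.pos == self.length
        return ("goal" if done else self.pos), done

class QAgent:
    # The learner/chooser. It knows nothing about what is being rewarded.
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)
        self.actions, self.alpha, self.gamma, self.epsilon = actions, alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, s, a, r, s_next):
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])

# Two different "wants" over the exact same environment:
def seek_goal(s, a, s_next):
    return 1.0 if s_next == "goal" else 0.0

def avoid_goal(s, a, s_next):
    return -1.0 if s_next == "goal" else 0.0

def run_episode(agent, env, reward_fn, max_steps=100):
    s = env.reset()
    for _ in range(max_steps):
        a = agent.act(s)
        s_next, done = env.step(a)
        agent.learn(s, a, reward_fn(s, a, s_next), s_next)
        s = s_next
        if done:
            break

agent, env = QAgent(actions=[-1, +1]), ChainEnv()
for _ in range(200):
    run_episode(agent, env, seek_goal)  # swap in avoid_goal to give the same agent the opposite "want"

Trained under seek_goal the agent heads for the end of the corridor; under avoid_goal it learns to stay away, with no change to the agent's own code. The "wants" live entirely in the reward function, which is why it can be redesigned or replaced as described above.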
