On Thu, Aug 1, 2013 at 4:41 AM, Jim Bromer <[email protected]> wrote:
> Matt:
> > Probability is a mathematical model
> > of belief, not of reality. And in general, brains estimate
> > probabilities using the 1/t rule. This was confirmed with many
> > experiments in classical conditioning and reinforcement learning in
> > animals.
>
> Do you remember how this was confirmed? How can the belief of an animal be
> confirmed? And how can a reported belief value of a human being be relied on?

In animal experiments in classical conditioning, the time to learn the
association (the inverse of the learning rate) is proportional to the time
interval, t, from the conditioned stimulus to the unconditioned stimulus. In
reinforcement learning, it is proportional to the interval between the action
and the reinforcement signal. There is no learning when t is negative (when
the order of events is reversed).

Well, not exactly. The function 1/t has a discontinuity at t = 0, but real
functions must be bounded and continuous. The learning rate actually peaks at
around 0.1 to 0.2 seconds. This is from a psychology textbook:

Schwartz, Barry, and Daniel Reisberg (1991), Learning and Memory, New York:
W. W. Norton and Company.

> Here is a cross-test that popped into my head. Suppose that a more probable
> reward is associated with a very difficult task and the less probable reward
> is associated with a simple task. Are you telling me that an intelligent
> animal won't first try a simpler task that has produced a reward less
> frequently? My guess is that he will try the simpler task a number of times
> before he would go after the difficult task if the difficulty was severe
> enough. At what difference between the rate of past rewards does the
> intelligent animal go for the difficult task before the easy task?

Depends on how you measure difficulty. And why would you want to complicate
the experiment this way? I'm talking about learning, which is the rate of
change of belief, as measured by the rate of change of the probability
estimate. Decisions are a separate matter: you are going to make them based on
expected reward, that is, the reward times the probability of receiving it.
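
To make the two ideas above concrete, here is a rough Python sketch. It is
only an illustration under my own assumptions, not anything taken from the
textbook: the constant k, the function names, and especially the smoothed
curve t / (t^2 + t_peak^2) are hypothetical. I picked that curve only because
it is bounded, is zero at t = 0, peaks at t_peak, and falls off like 1/t for
large t. Treating t = 0 as "no learning" in the idealized version is also my
choice, since 1/t is undefined there.

```python
def learning_rate(t, k=1.0):
    """Idealized 1/t rule: rate is k/t for t > 0, zero otherwise.

    t is the interval in seconds from the conditioned stimulus (or action)
    to the unconditioned stimulus (or reinforcement). k is a hypothetical
    scale constant.
    """
    if t <= 0:
        return 0.0  # reversed order of events: no learning
    return k / t


def bounded_learning_rate(t, t_peak=0.15):
    """Hypothetical bounded stand-in for 1/t (an assumption, not a fit).

    Peaks at t = t_peak and approaches 1/t as t grows.
    """
    if t <= 0:
        return 0.0
    return t / (t * t + t_peak * t_peak)


def expected_reward(reward, probability):
    """Expected reward: the reward times the probability of receiving it."""
    return reward * probability


def choose(options):
    """Return the name of the option with the highest expected reward.

    options maps a name to a (reward, probability) pair.
    """
    return max(options, key=lambda name: expected_reward(*options[name]))


if __name__ == "__main__":
    for t in (-1.0, 0.05, 0.15, 1.0, 10.0):
        print(t, learning_rate(t), bounded_learning_rate(t))
    print(choose({"easy": (1.0, 0.3), "hard": (1.0, 0.8)}))
```

Note that difficulty does not appear anywhere in choose(). To model the
cross-test you would first have to decide how to measure difficulty, for
example as a cost subtracted from the reward, and that is an extra assumption
on top of the learning rule.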

-- Matt Mahoney, [email protected]
