On Thu, Aug 1, 2013 at 4:41 AM, Jim Bromer <[email protected]> wrote:
> Matt:
> > Probability is a mathematical model
> > of belief, not of reality. And in general, brains estimate
> > probabilities using the 1/t rule. This was confirmed with many
> > experiments in classical conditioning and reinforcement learning in
> > animals.
>
> Do you remember how this was confirmed?  How can the belief of an animal be 
> confirmed?  And how can a reported belief value of a human being be relied on?

In classical conditioning experiments on animals, the time to learn the
association (the inverse of the learning rate) is proportional to the
time interval, t, from the conditioned stimulus to the unconditioned
stimulus. In reinforcement learning, it is proportional to the
interval between the action and the reinforcement signal. There is no
learning when t is negative (when the order of events is reversed).

Well, not exactly. The function 1/t has a discontinuity at t = 0, but
the learning rate of a real animal must be bounded and continuous. The
measured learning rate actually peaks at a t of around 0.1 to 0.2
seconds.
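
To make the proportionality concrete, here is a minimal Python sketch
of the idealized rule only. The function names and the constant k are
hypothetical and chosen for illustration; they are not from the
textbook. The sketch does not model the bounded peak near 0.1 to 0.2
seconds, and lumping t = 0 together with negative t is a simplifying
assumption.

```python
def time_to_learn(t, k=10.0):
    """Idealized time to learn an association, proportional to the delay t.

    t is the interval in seconds from the conditioned stimulus to the
    unconditioned stimulus (or from action to reinforcement).
    k is a hypothetical proportionality constant, not a measured value.
    """
    if t <= 0:
        # No learning when the order of events is reversed. Treating
        # t == 0 the same way is an assumption of this sketch.
        return float("inf")
    return k * t


def learning_rate(t, k=10.0):
    """Learning rate as the inverse of the time to learn (the 1/t rule)."""
    if t <= 0:
        return 0.0
    return 1.0 / time_to_learn(t, k)


if __name__ == "__main__":
    for delay in (-1.0, 0.5, 1.0, 2.0):
        print(delay, learning_rate(delay))
```

Doubling the delay halves the learning rate in this idealization,
and any negative delay gives a rate of zero.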

This is from a psychology textbook: Schwartz, Barry, and Daniel
Reisberg (1991), Learning and Memory, New York: W. W. Norton and
Company.

> Here is a cross-test that popped into my head.  Suppose that a more probable 
> reward is associated with a very difficult task and the less probable reward 
> is associated with a simple task.  Are you telling me that an intelligent 
> animal won't first try a simpler task that has produced a reward less 
> frequently?  My guess is that he will try the simpler task a number of times 
> before he would go after the difficult task if the difficulty was severe 
> enough.  At what difference between the rate of past rewards does the 
> intelligent animal go for the difficult task before the easy task?

Depends on how you measure difficulty. And why would you want to
complicate the experiment this way? I'm talking about learning, which
is the rate of change of belief, as measured by the rate of change of
the probability estimate. Decisions are a separate matter: you make
them based on expected reward, that is, the reward times the
probability of receiving it.
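
As a rough illustration of deciding by expected reward, here is a
short Python sketch. The task names, reward sizes, and probabilities
are made-up numbers, and the sketch deliberately leaves out any
measure of difficulty, since that depends on how you define it.

```python
def expected_reward(reward, probability):
    """Expected reward: the reward times the probability of receiving it."""
    return reward * probability


def choose(options):
    """Pick the option with the highest expected reward.

    options maps a task name to a (reward, probability) pair.
    """
    return max(options, key=lambda name: expected_reward(*options[name]))


# Hypothetical numbers: same reward size, different estimated probabilities.
options = {
    "easy": (1.0, 0.2),
    "hard": (1.0, 0.7),
}

if __name__ == "__main__":
    print(choose(options))  # prints "hard"
```

With equal reward sizes the choice follows the probability estimate
alone. A cost for effort could be subtracted from each expected
reward, but how to set it is exactly the open question about
measuring difficulty.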

-- 
-- Matt Mahoney, [email protected]

