Chris Spencer wrote:
On 12/19/06, Douglas S. Blank <[EMAIL PROTECTED]> wrote:
Chris,
Yes, there is the beginning of some support in Conx for similar
techniques. You can at least use this as an example. Take a look at the
SigmaNetwork in pyrobot/brain/conx.py. It is an example of a type of
CRBP (complementary reinforcement backprop). It works like this:
[snip]
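For readers coming to the archive without the snipped explanation: the core CRBP idea (from Ackley and Littman's complementary reinforcement back-propagation) is that the network's outputs are stochastic binary units; when the sampled output is rewarded, you backprop toward that output, and when it fails, you backprop (with a smaller step) toward its complement. The sketch below is not the pyrobot/Conx code -- it is a minimal, library-free illustration on XOR, and every name, layer size, and learning rate in it is my own choice:

```python
# Minimal CRBP (complementary reinforcement backprop) sketch on XOR.
# Hypothetical illustration only -- not the actual Conx SigmaNetwork code.
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def crbp_target(sampled, reward):
    """The CRBP rule: reinforce the sampled output on success,
    its complement on failure."""
    return sampled if reward > 0 else 1 - sampled

N_HIDDEN = 8
# One hidden layer; each hidden row holds [w_x0, w_x1, bias].
w1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(N_HIDDEN)]
w2 = [random.uniform(-1, 1) for _ in range(N_HIDDEN + 1)]  # hidden weights + bias

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
    p = sigmoid(sum(w2[i] * h[i] for i in range(N_HIDDEN)) + w2[-1])
    return h, p

PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

for epoch in range(5000):
    for x, answer in PATTERNS:
        h, p = forward(x)
        sampled = 1 if random.random() < p else 0    # stochastic binary output
        reward = 1.0 if sampled == answer else 0.0   # environment's evaluation
        t = crbp_target(sampled, reward)
        lr = 0.5 if reward > 0 else 0.2              # smaller step on failure
        # Ordinary backprop, but toward the CRBP target, not a supervised one.
        d_out = (t - p) * p * (1 - p)
        for i in range(N_HIDDEN):
            d_h = d_out * w2[i] * h[i] * (1 - h[i])
            w2[i] += lr * d_out * h[i]
            w1[i][0] += lr * d_h * x[0]
            w1[i][1] += lr * d_h * x[1]
            w1[i][2] += lr * d_h
        w2[-1] += lr * d_out

correct = sum(1 for x, answer in PATTERNS
              if (forward(x)[1] > 0.5) == bool(answer))
print("patterns solved:", correct, "of 4")
```

Because the reward here is simply "sampled output was right," the complement rule always ends up pointing at the true XOR answer, so the run usually solves all four patterns; training remains stochastic, though, since the targets depend on sampled outputs.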
Thanks for the great example. However, is it possible for SigmaNetwork
to train multiple distinct outputs? In your example, you have 11
output nodes attempting to approximate XOR. Suppose I wanted a network
that would approximate a policy function, requiring that each output
node represent a unique action. Am I right in thinking that
SigmaNetwork would be unable to perform this task, since it requires
that all the output nodes approximate a single target?
The way it is written, yes, you are right: the whole output layer
computes a single value. But you should be able to adapt the
SigmaNetwork source so that each subset of output nodes computes one
value; the output layer as a whole could then compute as many values as
you want (with overlapping subsets, too, if you wished).
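To make the subset idea concrete: one way to read a per-action policy out of such a layer is to partition the output activations into blocks and average each block into one value. This is only a sketch of the decoding step (the names `N_PER_GROUP`, `ACTIONS`, and `decode_policy` are mine, not Conx's):

```python
# Hypothetical sketch: one value per action from a partitioned output layer.
N_PER_GROUP = 4
ACTIONS = ["left", "right", "forward"]

def decode_policy(outputs):
    """Average each contiguous block of output activations into one
    value per action; overlapping subsets would work the same way."""
    assert len(outputs) == N_PER_GROUP * len(ACTIONS)
    values = {}
    for i, action in enumerate(ACTIONS):
        block = outputs[i * N_PER_GROUP:(i + 1) * N_PER_GROUP]
        values[action] = sum(block) / N_PER_GROUP
    return values

outputs = [0.9, 0.8, 0.7, 1.0,   # "left" block
           0.1, 0.2, 0.0, 0.1,   # "right" block
           0.5, 0.5, 0.4, 0.6]   # "forward" block
policy = decode_policy(outputs)
best = max(policy, key=policy.get)
print(best)  # -> left (averages: left 0.85, right 0.1, forward 0.5)
```

Using several units per value keeps the averaged estimate smoother than a single stochastic unit would be, which is the same reason the XOR example uses multiple output nodes for one target.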
Also, are you aware of any studies comparing CRBP to TD-Lambda? It
would be interesting to know which performs better.
I don't know of any, but I would be suspicious of some general statement
about one always performing better than the other.
-Doug
Regards,
Chris
_______________________________________________
Pyro-users mailing list
[email protected]
http://emergent.brynmawr.edu/mailman/listinfo/pyro-users