Re: [Pyro-users] Time Difference Learning with Conx

belinda thom Mon, 08 Jan 2007 19:09:51 -0800

I'm doing something similar right now (although I'm not at present using conx).

I used the algorithm Tom Mitchell suggests at the end of his 1st chapter in Machine Learning (a textbook).

That's when you're assuming a linear activation function, but I don't believe this method is any different for non-linear cases.

Updates are easily done for _each_ play in the game as follows: your current training estimate of the value of a state is compared to the value the function estimates for the "best" (in a minimax sense) next state (when the player will next play). Check out his text; it's pretty clear.
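
A minimal sketch of that per-play update, assuming a linear evaluation function trained with the LMS weight-update rule from Mitchell's first chapter. The names v_hat, lms_update, and train_on_game are hypothetical, the board features are whatever numeric encoding you choose, and none of this is Conx code. The training target for each state is the current estimate for the next state in which the same player moves; only the final state is trained toward the real score.

```python
def v_hat(weights, features):
    """Linear evaluation: w0 + sum(w_i * x_i). weights[0] is the bias term."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(weights, features, v_train, lr=0.1):
    """One LMS step: move each weight in proportion to the error and its input."""
    error = v_train - v_hat(weights, features)
    new_weights = [weights[0] + lr * error]  # bias sees a constant input of 1
    new_weights += [w + lr * error * x for w, x in zip(weights[1:], features)]
    return new_weights


def train_on_game(weights, states, final_score, lr=0.1):
    """Update once per play.

    states: feature vectors for the positions where the learner was to move,
    in game order (an assumed representation). The last one is trained toward
    final_score; every earlier one is trained toward the current estimate of
    the state that followed it.
    """
    for i, features in enumerate(states):
        if i + 1 < len(states):
            v_train = v_hat(weights, states[i + 1])  # bootstrap from successor
        else:
            v_train = final_score  # only the end state uses the true outcome
        weights = lms_update(weights, features, v_train, lr)
    return weights
```

In a minimax setting, the successor in states[i + 1] would be the position reached after your chosen move and the opponent's reply. With a non-linear network the target is computed the same way, and the weight change comes from a backprop step toward v_train instead of the closed-form LMS step.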


HTH,
--b

On Jan 8, 2007, at 6:40 PM, Chris S wrote:

I'm trying to use Conx to train a couple networks to aid in a board
game engine, and I'd like to get some feedback on my strategy. The
networks will be used in a minimax search to prune the state tree and
evaluate the player's position. Both networks will take the board as
input, with a layer for each position on the board. One network, the
move suggester, will output a layer for each action, indicating how
much performing that action will increase the player's score. The
second network, the score estimator, will output estimates of each
player's score.

My question is what is the best way to train these networks? My
current strategy is to do nothing until the game is over. I'll use a
static algorithm to reliably score the end game state. Then I'll
create a training corpus by taking the score and iterating through
each move in the game, creating training sets in the form of
[boardstate, finalscore]. For the move suggester, I think I'd have to
disable the output layers for the actions not selected.

This method seems fairly primitive and naive, but I've never done time
difference learning with neural networks before. Is there a better
way? Any suggestions are appreciated.

Regards,
Chris
_______________________________________________
Pyro-users mailing list
[email protected]
http://emergent.brynmawr.edu/mailman/listinfo/pyro-users


