I'm trying to use Conx to train a couple of networks to aid a board game engine, and I'd like some feedback on my strategy. The networks will be used in a minimax search to prune the state tree and to evaluate the player's position. Both networks will take the board as input, with one input unit for each position on the board. One network, the move suggester, will have an output unit for each action, indicating how much performing that action will increase the player's score. The second network, the score estimator, will output an estimate of each player's score.
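To make the setup concrete, here is a rough sketch of the data side of what I mean. This is not Conx API; all the names (board_to_vector, make_score_corpus, make_move_corpus, NUM_ACTIONS, etc.) are placeholders I made up for illustration:

```python
# Hypothetical sketch of corpus construction for the two networks.
# None of these names come from Conx; they are placeholders.

NUM_CELLS = 9        # e.g. a 3x3 board; one input unit per position
NUM_ACTIONS = 9      # one output unit per possible action

def board_to_vector(board):
    """Encode a board (list of 0 = empty, 1 = player, -1 = opponent)
    as a flat input vector, one value per board position."""
    return [float(cell) for cell in board]

def make_score_corpus(game_states, final_scores):
    """Score estimator: pair every board state seen during the game
    with the final scores computed after the game is over."""
    return [(board_to_vector(b), list(final_scores)) for b in game_states]

def make_move_corpus(game_states, moves_played, final_score):
    """Move suggester: set a target only on the output unit for the
    action actually played; mask the rest (None = no error signal)."""
    corpus = []
    for board, action in zip(game_states, moves_played):
        target = [None] * NUM_ACTIONS
        target[action] = final_score
        corpus.append((board_to_vector(board), target))
    return corpus
```

So a finished game would yield one [boardstate, finalscore] pair per move for the score estimator, and one masked-target pair per move for the move suggester.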
My question is: what is the best way to train these networks? My current strategy is to do nothing until the game is over, then use a static algorithm to reliably score the final game state. From that score I'll build a training corpus by iterating through each move of the game, producing training pairs of the form [boardstate, finalscore]. For the move suggester, I think I'd have to disable the output units for the actions that weren't selected. This method seems fairly primitive and naive, but I've never done temporal-difference learning with neural networks before. Is there a better way? Any suggestions are appreciated.

Regards,
Chris

_______________________________________________
Pyro-users mailing list
[email protected]
http://emergent.brynmawr.edu/mailman/listinfo/pyro-users
