First, a general hypothesis on heuristics: one should apply heuristics to the first few moves beyond the fringe of the UCT tree, and not later. It's important that these early moves be good, but not worth the time to make later moves good. Thoughts? Is anyone already using this idea?

Now, a specific heuristic I'd like to try. If anyone can point out anything horribly wrong with it before I go to the trouble to implement it, that'd be nice. :-)

Maintain a set of if-then rules, perhaps 1000 of them. Each rule consists of a board configuration and a suggested move. (Originally, they're all identically [<empty board> => E5] for 9x9.) As the game progresses, this population of rules will change.

When it's time to heuristically choose a move, compare the current board configuration against the "if" part of each rule. Play the move from the closest match (nearest neighbor). There's room for creativity in the definition of nearness, but something like Hamming distance might suffice.

The population of rules is updated during the game. We might do this, for example, whenever a move becomes the best move from its UCT node. (Note that I'm using "best" here to mean "most likely to win" and not "highest UCT value".) When this happens, ask the population what it would do given this board configuration. If the answer is the move in question, do nothing. Otherwise, overwrite the oldest rule with a new rule suggesting this move for this configuration.

My hope is that this heuristic will suggest the move that has been most effective on similar boards.

Peter Drake
http://www.lclark.edu/~drake/



_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to