[computer-go] On UCT Heuristics

Peter Drake Fri, 16 Mar 2007 08:06:37 -0800

I'm trying to come up with a general form (an interface, in Java terms) for heuristics in UCT. As I see it, there are three phases of each MC run:

1) When choosing a move from a tree node from which all moves have been tried, use UCT and no heuristics.

2) When choosing a move from a tree node from which some moves are unexplored, heuristics may come into play. This case is vastly more common than case 1; almost all tree nodes are leaves.

3) When choosing a move beyond the tree, heuristics may come into play. This case is even more common, but heuristics here must be fast and nondeterministic.

I therefore propose that a heuristic suggest a move given a board configuration (including things like the location of the last move) and, optionally, a set of allowable moves. It is helpful to maintain a weighted set of heuristics.
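To make the proposal concrete, here is a rough sketch of what such an interface might look like. Everything named here (Board, Heuristic, WeightedHeuristic, NearLastMove, the point-index encoding and the 19x19 size) is an assumption made for illustration, not an existing API.

```java
import java.util.Set;

/** Hypothetical read-only view of a position; not taken from any real Go library. */
interface Board {
    int lastMove();              // point index of the last move, or -1 if none
    Set<Integer> legalMoves();   // point indices where the side to move may play
}

/** A heuristic suggests one move from an allowed set, or -1 if it has no opinion. */
interface Heuristic {
    int suggest(Board board, Set<Integer> allowed);
}

/** Pairs a heuristic with its weight so that a weighted set can be maintained. */
final class WeightedHeuristic {
    final Heuristic heuristic;
    final double weight;

    WeightedHeuristic(Heuristic heuristic, double weight) {
        if (weight < 0) throw new IllegalArgumentException("weight must be non-negative");
        this.heuristic = heuristic;
        this.weight = weight;
    }
}

/** Toy example: suggest the allowed point closest (Manhattan distance) to the last move. */
final class NearLastMove implements Heuristic {
    static final int SIZE = 19;  // assumed board width; points are encoded as row * SIZE + col

    @Override
    public int suggest(Board board, Set<Integer> allowed) {
        int last = board.lastMove();
        if (last < 0) return -1;  // no previous move, so no opinion
        int best = -1;
        int bestDistance = Integer.MAX_VALUE;
        for (int move : allowed) {
            int distance = Math.abs(move / SIZE - last / SIZE)
                         + Math.abs(move % SIZE - last % SIZE);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = move;
            }
        }
        return best;
    }
}
```

Passing the allowed set explicitly lets the same heuristic serve both case 2 (only the unexplored moves) and case 3 (all legal moves).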

When choosing a move for case 2 above, we consider each move in the allowed set (that is, the moves which have not already been explored from this node). In a heuristic set, each move is given a value that is the weighted sum of the values given by the individual heuristics.
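A minimal sketch of the weighted sum for case 2 might look like the following. It assumes each heuristic can be reduced to a function from a move (a point index) to a value, with the board reachable through the function's closure, and it assumes the highest-valued unexplored move is the one played. The names WeightedSum and Entry are made up for this example.

```java
import java.util.List;
import java.util.Set;
import java.util.function.ToDoubleFunction;

final class WeightedSum {
    /** A move evaluator and its weight (hypothetical helper type). */
    static final class Entry {
        final ToDoubleFunction<Integer> evaluator;
        final double weight;

        Entry(ToDoubleFunction<Integer> evaluator, double weight) {
            this.evaluator = evaluator;
            this.weight = weight;
        }
    }

    /** Weighted sum of the values the individual heuristics give to one move. */
    static double value(List<Entry> entries, int move) {
        double sum = 0.0;
        for (Entry e : entries) {
            sum += e.weight * e.evaluator.applyAsDouble(move);
        }
        return sum;
    }

    /** Returns the unexplored move with the highest combined value, or -1 if none remain. */
    static int choose(List<Entry> entries, Set<Integer> unexplored) {
        int best = -1;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int move : unexplored) {
            double v = value(entries, move);
            if (v > bestValue) {
                bestValue = v;
                best = move;
            }
        }
        return best;
    }
}
```

Ties are broken by iteration order here; a real player would probably want to break them at random.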

In case 3, we choose one of the heuristics at random, with each heuristic's weight being its probability of being chosen. The move suggested by that heuristic, from among all legal moves, is chosen.
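The random choice in case 3 could be done with a simple roulette-wheel pick over the weights, roughly as below. This is only a sketch: the class name HeuristicRoulette is invented, and it assumes the weights are non-negative and need not already sum to 1 (they are normalized by their total).

```java
import java.util.Random;

final class HeuristicRoulette {
    /**
     * Returns index i with probability weights[i] / (sum of all weights).
     * Weights are assumed non-negative with a positive total.
     */
    static int pick(double[] weights, Random rng) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        double r = rng.nextDouble() * total;  // uniform in [0, total)
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0.0) {
                return i;
            }
        }
        // Rounding can leave r at exactly 0; fall back to the last positive weight.
        for (int i = weights.length - 1; i >= 0; i--) {
            if (weights[i] > 0.0) {
                return i;
            }
        }
        throw new AssertionError("unreachable: total was positive");
    }
}
```

The heuristic at the returned index would then be asked for its suggestion among all legal moves. The pick costs one random number and one pass over the weights, which matters since this runs on every move beyond the tree.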


Heuristics that could be written in this framework include:

Play a capturing move
Play near the last move

Play the move that, when played at any point, has led to the highest proportion of wins in past MC runs

Play the move that, when played as a response to the previous move, has led to the highest proportion of wins in past MC runs

Play the move that matches the most patterns from some pattern database
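As one illustration, the "highest proportion of wins in past MC runs" heuristic might be backed by a table like the one sketched here. The class WinRateTable and its methods are hypothetical, moves are assumed to be point indices, and treating a never-played move as having a win rate of 0 is just one possible choice.

```java
final class WinRateTable {
    private final int[] wins;
    private final int[] runs;

    WinRateTable(int points) {
        wins = new int[points];
        runs = new int[points];
    }

    /** Records one finished MC run in which the given move was played at some point. */
    void record(int move, boolean won) {
        runs[move]++;
        if (won) {
            wins[move]++;
        }
    }

    /** Proportion of recorded runs containing the move that were won; 0 if never played. */
    double winRate(int move) {
        return runs[move] == 0 ? 0.0 : (double) wins[move] / runs[move];
    }

    /** Suggests the allowed move with the highest win proportion, or -1 if none are allowed. */
    int suggest(Iterable<Integer> allowed) {
        int best = -1;
        double bestRate = Double.NEGATIVE_INFINITY;
        for (int move : allowed) {
            double rate = winRate(move);
            if (rate > bestRate) {
                bestRate = rate;
                best = move;
            }
        }
        return best;
    }
}
```

The "response to the previous move" variant could presumably keep one such table per previous move.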

There are, of course, many more. Is anyone using heuristics that don't fit into this framework?



Peter Drake
http://www.lclark.edu/~drake/


