UCB (and hence UCT) would treat the following sequences of wins (1) and losses (0) the same, since each has 16 wins in 32 playouts:

01010101010101010101010101010101
00000000000000001111111111111111
11111111111111110000000000000000

Clearly, it would be better to favor the second sequence, because that move has done more for us lately. Because the tree below each move keeps growing, the values of the moves are moving targets.

Has anyone done any work dealing with this phenomenon, e.g., somehow giving more weight to more recent playouts?
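
To make the question concrete, here is a minimal Python sketch comparing the plain mean that UCB uses with an exponentially discounted mean. The discount factor gamma = 0.9 and the function names plain_mean and discounted_mean are assumptions chosen for illustration only; this is one possible way to weight recent playouts, not an established fix.

```python
def plain_mean(results):
    """Average win rate, as used by UCB; the order of results does not matter."""
    return sum(results) / len(results)


def discounted_mean(results, gamma=0.9):
    """Win rate in which a playout k steps in the past gets weight gamma**k.

    gamma is a hypothetical tuning parameter (0 < gamma <= 1);
    gamma = 1 reduces to the plain mean.
    """
    wins = 0.0
    weight = 0.0
    for r in results:  # results are ordered oldest first
        wins = gamma * wins + r
        weight = gamma * weight + 1.0
    return wins / weight


# The three sequences from the post, oldest playout first.
sequences = {
    "alternating": "01010101010101010101010101010101",
    "losses then wins": "00000000000000001111111111111111",
    "wins then losses": "11111111111111110000000000000000",
}

for name, s in sequences.items():
    playouts = [int(c) for c in s]
    print(f"{name:17s} plain={plain_mean(playouts):.3f} "
          f"discounted={discounted_mean(playouts):.3f}")
```

The plain mean is 0.5 for all three sequences, while the discounted mean ranks the second sequence highest and the third lowest. A full variant would presumably also need to discount the visit counts in the exploration term, which this sketch leaves out.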

Peter Drake
http://www.lclark.edu/~drake/



