On 11/4/07, Yamato [EMAIL PROTECTED] wrote:
-From Re: So many MoGo run on cgos 9x9:
greenpeep also uses patterns derived from 20k UCT self-play games.
These are simple local patterns with scores that (roughly) indicate
the probability that the move at the center of the pattern was
selected by UCT during these games. These patterns are then used both
to bias moves at UCT nodes which have few visits, and also to bias the
playouts. What I've seen is:
- Biasing playouts by patterns is much better than unbiased playouts
- Playouts using self-play patterns together with MoGo-style move
preferences (favor defensive moves and captures, as well as local
moves biased by the self-play patterns, before resorting to a global
move biased by patterns) yield much better results than just using
the patterns by themselves globally.
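To make the ordering in that last point concrete, here is a rough Python sketch of the playout move choice. It is not greenpeep's actual code. The function names, the three candidate lists, and the uniform choice among urgent moves are all assumptions for illustration; pattern_score stands in for a lookup into the self-play pattern table.

```python
import random

def weighted_choice(moves, weight, rng):
    """Pick one move with probability proportional to weight(move)."""
    weights = [max(weight(m), 0.0) for m in moves]
    if sum(weights) <= 0.0:
        # No usable pattern information: fall back to a uniform choice.
        return rng.choice(moves)
    return rng.choices(moves, weights=weights, k=1)[0]

def choose_playout_move(urgent_moves, local_moves, global_moves,
                        pattern_score, rng=random):
    """Hypothetical priority order for one playout move.

    urgent_moves: captures and defensive (saving) moves, assumed precomputed
    local_moves:  legal moves near the previous move
    global_moves: all remaining legal moves
    """
    if urgent_moves:
        return rng.choice(urgent_moves)
    if local_moves:
        return weighted_choice(local_moves, pattern_score, rng)
    if global_moves:
        return weighted_choice(global_moves, pattern_score, rng)
    return None  # nothing to play: pass
```

The point of the layering is that the pattern scores only decide between moves of the same priority class; they never override a capture or a save.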
I am interested in this improvement.
Do you have any data to compare the performance of biased playouts
with MoGo-style one? (the winning rate against GNU Go, etc)
Also, how large and how many are your patterns?
greenpeep's patterns include the local 3x3 neighborhood. In addition,
for each of the four nearest neighbors, it includes liberty count info if
that neighbor is occupied, or otherwise info about the 2-away point just
beyond that neighbor if the neighbor itself is empty.
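As a rough illustration of that pattern shape, here is a Python sketch that builds a hashable key for one point. The board representation (a dict of occupied points), the liberty_count callback, the cap on liberty counts, and the edge marker are assumptions made for the example, not details of greenpeep. It also ignores symmetry and color-to-move normalization.

```python
EMPTY, BLACK, WHITE, EDGE = ".", "B", "W", "#"

def color_at(board, size, r, c):
    """Color at (r, c); EDGE if off the board, EMPTY if unoccupied."""
    if not (0 <= r < size and 0 <= c < size):
        return EDGE
    return board.get((r, c), EMPTY)

def pattern_key(board, size, point, liberty_count, max_libs=3):
    """Build a tuple describing the local pattern around an empty point.

    liberty_count(p) is a hypothetical callback returning the number of
    liberties of the chain occupying p; counts are capped at max_libs.
    """
    r, c = point
    # The eight points of the 3x3 neighborhood around the center.
    neighborhood = tuple(
        color_at(board, size, r + dr, c + dc)
        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    extras = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        near = color_at(board, size, r + dr, c + dc)
        if near in (BLACK, WHITE):
            # Occupied neighbor: record its (capped) liberty count.
            libs = min(liberty_count((r + dr, c + dc)), max_libs)
            extras.append(("libs", libs))
        elif near == EMPTY:
            # Empty neighbor: record the point two away in that direction.
            far = color_at(board, size, r + 2 * dr, c + 2 * dc)
            extras.append(("far", far))
        else:
            extras.append(("edge", None))
    return neighborhood + tuple(extras)
```

The resulting tuple can be used directly as a dictionary key for a score table.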
Currently there are about 250k patterns. Many of these are very rarely
used, however; this is just the set of patterns that occurred in the 20k
self-play games. I'm calling these the self-play patterns below.
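For what it's worth, a score that roughly tracks "probability that UCT selected the move at the center of the pattern" can be estimated by simple counting. The sketch below is only one way to do it and is not taken from greenpeep. The input format (one record per self-play position, listing the pattern keys of the candidate moves and the key of the move UCT chose) and the min_seen cutoff are assumptions.

```python
from collections import Counter

def pattern_scores(positions, min_seen=1):
    """Estimate, per pattern, how often its center move was the chosen one.

    positions: iterable of (candidate_keys, chosen_key) pairs, where
    candidate_keys lists the pattern key of every legal move in a position
    and chosen_key is the key of the move UCT actually played.
    """
    seen, chosen = Counter(), Counter()
    for candidate_keys, chosen_key in positions:
        seen.update(candidate_keys)
        chosen[chosen_key] += 1
    # Patterns seen fewer than min_seen times are dropped as too noisy.
    return {k: chosen[k] / n for k, n in seen.items() if n >= min_seen}
```

With a table of this size, many entries will rest on only a handful of observations, so some cutoff or smoothing is probably needed in practice.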
Here are some greenpeep win rates against GnuGo 3.6 level 8 with 7.5 komi.
Each of these is with greenpeep playing black, with 10k playouts per move.
Just to note, these shouldn't be taken as a set of well-controlled
experiments; various kinds of tuning may have taken place between these results.
1) basic UCT with self-play patterns in the MC playouts. This had no other
biases for locality, captures, etc., and did not use patterns to bias UCT
and did not use all-moves-as-first: 24%
2) like (1), but replacing self-play patterns by an implementation of MoGo's
full scheme for MC playouts (as described in the MoGo report RR-6062);
this doesn't use the self-play patterns at all: 47%
3) like (2), but using the self-play patterns to bias the MoGo MC scheme's
final fallback to random global moves: 54%
4) like (3), but adding self-play pattern bias in the UCT tree, and all-moves-as-first:
5) Like (4), but replacing MoGo's local-move patterns by locally restricted
self-play patterns (so this is no longer using MoGo's hand-coded patterns,
but still uses the explicit preference for captures/saves and local moves):
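The message does not say how the pattern bias enters the UCT tree, only that it matters at nodes with few visits. One common way to get that behavior is a bonus that shrinks as the visit count grows. The Python sketch below shows that idea under stated assumptions: the formula, the parameter names, and the 0.5 default for unvisited children are illustrative, not greenpeep's.

```python
import math

def biased_uct_value(wins, visits, parent_visits, pattern_score,
                     explore=1.0, bias_weight=1.0):
    """UCB-style value plus a pattern bonus that fades with visits.

    All parameters and the exact formula are assumptions for illustration.
    """
    # Unvisited children get a neutral 0.5 mean instead of an infinite value,
    # so the pattern bonus can order them.
    mean = wins / visits if visits > 0 else 0.5
    exploration = explore * math.sqrt(math.log(parent_visits + 1) / (visits + 1))
    # The bonus dominates at low visit counts and vanishes as visits grow.
    bias = bias_weight * pattern_score / (visits + 1)
    return mean + exploration + bias
```

At each tree node the child with the highest value is selected, so among rarely visited children the pattern score largely decides the order, while well-explored children are ranked by their actual results.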
computer-go mailing list