> My problem is that I can't find many papers about learning of MC playout
> policies, in particular patterns.
A just-published paper on learning MC policies: http://hal.inria.fr/inria-00456422/fr/

It works quite well for Havannah (not tested on Hex, I think). In the case of Go, however, the Wang policy is already too strong to be improved that way.

Incidentally, the big improvements over Yizao Wang's policy in MoGo are:

- fill board (David, Martin, I hope it now works well for you; for us it is extremely efficient at long time settings in 19x19, but not at short time settings), i.e. randomization in the Monte-Carlo part for "jumping" to empty parts of the goban;
- nakade.

(There are other improvements, but much smaller ones, e.g. approach moves.) Fill board and nakade are described in http://hal.inria.fr/inria-00386477/.

Best regards,
Olivier
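[For readers unfamiliar with the fill-board idea mentioned above, here is a minimal sketch of how such a randomization step might look. It is an illustration under my own assumptions, not MoGo's actual code: the trial count, board representation, and function names are all hypothetical. The idea is that, before consulting the regular playout policy, the playout occasionally tries a few random empty points and plays one whose whole 3x3 neighborhood is empty, so simulations "jump" to unexplored parts of the goban.]

```python
import random

SIZE = 19
EMPTY = "."

def neighborhood_empty(board, x, y):
    """True if (x, y) and all of its on-board neighbors are empty."""
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx < SIZE and 0 <= ny < SIZE and board[nx][ny] != EMPTY:
                return False
    return True

def fill_board_move(board, trials=5, rng=random):
    """Try a few random points; return one lying in a completely empty
    3x3 neighborhood, or None to fall back to the normal playout policy.
    (trials=5 is an illustrative choice, not MoGo's actual setting.)"""
    for _ in range(trials):
        x, y = rng.randrange(SIZE), rng.randrange(SIZE)
        if board[x][y] == EMPTY and neighborhood_empty(board, x, y):
            return (x, y)
    return None
```

Because the candidate points are sampled at random rather than scanned, the step costs only a handful of probes per playout move, which is consistent with its use at long time settings where playout diversity matters more than raw speed.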
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
