> My problem is that I can't find many papers about learning of MC playout
> policies, in particular patterns.
>

A just-published paper about learning MC policies:
http://hal.inria.fr/inria-00456422/fr/
It works quite well for Havannah (not tested on Hex, I think).

But in the case of Go, the Wang policy is too strong to be improved that
way.

Incidentally, the big improvements over Yizao Wang's policy in MoGo are:

- fill board (David, Martin, I hope it now works well for you - for us it
is extremely efficient at long computation times on 19x19, but not at
short time settings!), i.e. randomization in the Monte-Carlo part for
"jumping" to empty parts of the goban

- nakade

(there are other improvements, but much smaller ones: e.g. approach moves)

(fill board and nakade in http://hal.inria.fr/inria-00386477/)
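The fill-board randomization above can be sketched roughly as follows. This is a minimal sketch, not MoGo's actual code: the board representation (a set of occupied points), the number of trials, and the function name are all my assumptions; the idea is just to occasionally sample a few random points and play one whose eight neighbours are all empty, so the playout jumps to an empty area of the goban.

```python
import random

def fill_board_move(board, size, num_trials=5):
    """Hypothetical sketch of the 'fill board' idea.

    board: set of occupied (x, y) points (assumed representation).
    Returns an empty point whose 8 neighbours are all empty, or None
    to fall back to the ordinary playout policy.
    """
    for _ in range(num_trials):
        x, y = random.randrange(size), random.randrange(size)
        if (x, y) in board:
            continue
        # Collect the on-board neighbours of the sampled point.
        neighbours = [(x + dx, y + dy)
                      for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                      if (dx, dy) != (0, 0)
                      and 0 <= x + dx < size and 0 <= y + dy < size]
        # Accept only a point sitting in a fully empty neighbourhood.
        if all(p not in board for p in neighbours):
            return (x, y)
    return None  # no empty area found: use the normal policy
```

With few trials this costs almost nothing per playout, which matches the observation that it pays off mainly at long time settings on 19x19.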

Best regards,
Olivier
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
