Re: [computer-go] Some thoughts about Monte Carlo

Magnus Persson Fri, 18 Jan 2008 09:39:27 -0800

So I wouldn't be surprised at all if at some point you'll see a
marriage of the best ideas of traditional Go programs and Monte-Carlo /
UCT. In fact, this is most likely already happening as these
Monte-Carlo programs use algorithms / ideas from the traditional
programs for tactics, pattern-matching and possibly others.. And I go
out on a limb to predict that at some point "heavy playouts" will use
information like territory, eye-space and group-strength to guide their
playouts just like the "traditional" programs do. And that using fixed
3x3 patterns will be a passing fad.

My first MC-Program actually used the code from my knowledge intensiveprogram Viking. Then I rewrote everything from scratch which isValkyria2.7. And I am now working with rewriting the search part inorder to allow threaded search and that is Valkyria3.

Since Viking used territory, eye-space etc to compute groupstrength, Iguess I could comment on what I believe can and cannot be used inheavy playouts.

Valkyria uses eye-space calculations, which prunes really bad moves inthe playouts. My board representation keeps track of all liberties,and surrounding stones for all chains. It is designed for playouts(undo is not possible) but a lot of information is computed so that atit is easy to write "knowledge"-algorithms. It is faster than my oldVikingcode but still it is much easier to work with.

Valkyria does not do explicit 3x3 patternmatching, but uses handcodedpatternmatching which often are equivalent to 3x3 patterns. Butsometimes the code will look at a larger area. It also computesladders in playouts. The ladders are fast and sloppy, that is, in somesituations they make mistakes, but in the playouts one makes (andshould make) compromises with correctness all the time. It is theUCT-search that should find out how to play correctly not the playouts.

My program as a consequence is probably the slowest in terms ofplayouts/second. It does about 1200/sec on 9x9 on Pentium-M 1.4Ghz.

The good thing is that the code is so bloated I do not slow it down somuch when I add new stuff to the playouts. I recently found a lot ofbugs working with version 3 so I added that to the 2.7.8 which hasbeen rated about 40-60 Elo higher than 2.7.7.

I do not think that territory and groupstrength will ever be includedinto heavy playouts. Because this is implicitly computet by theMC-evaluation. Several MC-programs today collects statistics such asthe probability that each position on the board is assigned to blackin the end of playouts. I recently added a colored graph of suchstatistics to Valkyria3 and it looks really pretty. You can easily seewhat is safe territory or not and which groups are weak and not.Actually this visualisation is good to hunt for bugs in the playouts.If a groups is almost dead despite having plenty of eyspace thensomething can probably be fixed in the playouts. These kind ofstatistics looke quite nice already after as little as 100 playouts,and are in my opinion better than what I ever achieved with Vikingevaluation function. Most importantly one sees that althoughMC-evaluation often misjudges positions, the effect of moves is almostalways goes in the right directions. Traditional statical evaluationhas the problem of being brittle. It does 90% right but 10% completelywrong. MC-eval does 80 % right and 20% slightly wrong. (I made upthose numbers of course but hope you get the idea)

So computing territory and groups strength in playouts appears to meas reinventing the wheel because this is what MC-eval actually isdoing implicitely. An open question is to what extent these statistcscan be used to improve performance in UCT-search or biasing playouts.I have no clear idea about it because, in some versions of my programsit was good and sometime it did not. It may be that UCT-search itselfis so efficient that whenever you have reliable statistics for aposition the search does not benefit from that knowledge anymore.

I also agree that some really old techniques do come back. In my firstgoporgram for the Amiga I wrote 18 years ago approximately I firstimplemented the following pattern


?B?
W+W
?B?

and variations thereof. When I first made the playouts heave this wasthe first patterns I added, and it worked really well. So the irony isthat my current programs have more in common with my very firstcompletely stupid program than the versions in between...


-Magnus


_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Some thoughts about Monte Carlo

Reply via email to