[computer-go] Some thoughts about Monte Carlo

Mark Boon Fri, 18 Jan 2008 05:03:45 -0800

I'm fairly new to the subject of Monte Carlo and am in the process of catching up on reading, so I hope you guys have some patience with me while I get into this and ask a lot of questions. I got side-tracked away from computer-Go programming for quite a while for various reasons but have been at it again full-time recently.

Let me start by saying I haven't followed the whole Monte-Carlo / UCT story all that well. Although it was clear it did well on 9x9, I wasn't all that convinced it would scale up to 19x19. That it did well on 9x9 didn't surprise me much, as I've always felt 9x9 is susceptible to strong play from some extreme brute-force method using today's computing power.

Combining Monte-Carlo methods with what's generally called "heavy playouts" changes things a lot however, and this is where it piqued my interest. I often see reference to Monte-Carlo programs as opposed to "traditional" Go programs. But what I'm seeing with the heavy playouts is "deja-vu all over again". It seems to me the evolution of the heavy playouts is following a course similar to that of the first rule-based Go programs in the early 1980s. To use another famous quote: "history doesn't repeat itself but it rhymes."

So I wouldn't be surprised at all if at some point you'll see a marriage of the best ideas of traditional Go programs and Monte-Carlo / UCT. In fact, this is most likely already happening, as these Monte-Carlo programs use algorithms and ideas from the traditional programs for tactics, pattern-matching and possibly others. And I'll go out on a limb to predict that at some point "heavy playouts" will use information like territory, eye-space and group-strength to guide their playouts, just like the "traditional" programs do. And that using fixed 3x3 patterns will be a passing fad.

Anyway, as it is I have made a few initial forays into Monte-Carlo algorithms. I hereby thank Peter Drake for his work on Orego, which is a fine package to learn about Monte-Carlo. I used it to make my own implementation. Not that there was much wrong with Orego, but the best way to understand and internalize a concept is to actually implement it. Moreover I felt I needed a framework that is a bit more generalized than Orego. Lastly I was convinced of course that I'd be able to make it faster. That last part didn't turn out to be true though. On my iMac, which is a 2GHz Core Duo, I get a little under 30K playouts per second on 9x9. A little over 50K when using both cores. Orego is in the same ball-park. At least making a more generalized implementation didn't hurt performance, so I'm OK with this first attempt. I think there's still room for optimizations, if I thought it was important at this stage.

This brings me to my first point. Optimizing early in a project is like listening to the devil. It eats up a lot of time, and the visible progress is gratifying, but in the grand scheme of things it's not all that important to do early. I implemented it using pseudo-liberties because... uh, well, because that's what everyone seemed to be doing to get high numbers of playouts per second. But I'm already starting to doubt that using pseudo-liberties is all that useful. Is anybody still using pure-random playouts for anything? As soon as you start to do anything more, pseudo-liberties are pretty useless. Yes, using real liberties slows things down a lot. But the speed of the random playout becomes less and less important with heavy playouts.
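To make the pseudo-liberty point concrete, here is a minimal sketch. It assumes the usual definition, where a chain's pseudo-liberty count adds up the empty neighbors of each stone separately, so a shared empty point is counted more than once. The helper names and the 5x5 position are hypothetical and are not taken from Orego or from my own code.

```python
# Hypothetical helpers for illustration; points are (row, col) tuples.

def neighbors(point, size):
    """Yield the on-board orthogonal neighbors of a point."""
    row, col = point
    for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < size and 0 <= c < size:
            yield (r, c)

def pseudo_liberties(chain, occupied, size):
    """Count empty neighbors once per adjacent stone (duplicates included)."""
    return sum(1 for stone in chain
               for n in neighbors(stone, size) if n not in occupied)

def real_liberties(chain, occupied, size):
    """Return the set of distinct empty points adjacent to the chain."""
    return {n for stone in chain
            for n in neighbors(stone, size) if n not in occupied}

# Made-up 5x5 position, chosen only to show the ambiguity.
SIZE = 5
l_chain = {(0, 1), (1, 1), (1, 0)}        # black L-shape wrapped around a corner
corner_stone = {(4, 4)}                   # lone black stone in the opposite corner
white = {(0, 2), (1, 2), (2, 1), (2, 0)}  # white stones hemming in the L-shape
occupied = l_chain | corner_stone | white

# Both chains have the same pseudo-liberty count, but only the L-shape is in atari.
same_pseudo_count = (pseudo_liberties(l_chain, occupied, SIZE)
                     == pseudo_liberties(corner_stone, occupied, SIZE))
l_chain_in_atari = len(real_liberties(l_chain, occupied, SIZE)) == 1
```

A pseudo-liberty count of zero still means the chain is captured, which is all a pure-random playout needs. Anything that wants to ask "is this chain in atari?" needs the real liberties.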

Next I was thinking about another subject that got some attention on this list, the mercy rule. It seems to save about 10% in the number of moves per game (on 19x19) and result in about a 20% gain in performance. This discrepancy is most likely due to the fact that the administration, whether using pseudo-liberties or real ones, is much slower towards the end of the game because you have more moves that merge chains. And those 10% of moves it saves are of course always at the end. So is it relevant? I don't know whether heavy playouts will be slower towards the end of the game or not. Possibly yes, as more of the moves made will have a small number of liberties that will need tactical analysis. I'd say that generally reducing the move-count is a good thing whichever method one uses. Possibly at a later stage more sophisticated methods can be developed to abort a game early.
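For reference, a minimal sketch of a mercy-rule check. It assumes the common formulation where a playout is stopped as soon as one side has some fixed number of stones more on the board than the other. The function names, the threshold value and the stone-count trace are all hypothetical. A real playout loop would update the counts incrementally as moves and captures happen.

```python
def mercy_winner(black_stones, white_stones, threshold):
    """Return 'black' or 'white' if the stone-count lead reaches threshold, else None."""
    lead = black_stones - white_stones
    if lead >= threshold:
        return "black"
    if -lead >= threshold:
        return "white"
    return None

def run_with_mercy(stone_counts, threshold):
    """Walk a trace of (black, white) stone counts, one entry per move played.

    Returns (winner, moves_played). winner is None if the rule never fired,
    in which case the playout would be scored normally at the end.
    """
    moves_played = 0
    for moves_played, (black, white) in enumerate(stone_counts, start=1):
        winner = mercy_winner(black, white, threshold)
        if winner is not None:
            return winner, moves_played
    return None, moves_played

# Made-up trace: Black captures a large group around move 5.
trace = [(1, 0), (1, 1), (2, 1), (5, 1), (9, 1), (9, 2)]
result = run_with_mercy(trace, threshold=5)
```

The threshold is a trade-off. Set it too low and playouts get cut off while the losing side could still recover. Set it too high and the rule hardly ever fires.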

Lastly (for this post anyway) I looked at the multiple-suicide question. Whether you allow it or not seems to have a negligible effect on performance. What I did first was to tune the komi until the win-ratio for Black and White was as close to 50% as possible using random playouts. The number ended up being 2. Next I allowed suicide only for White. This pushed Black's winning percentage up to 65%. I think this pretty much settles the matter, and it's clear that allowing suicide is very detrimental to quality of play in a Monte-Carlo program.
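The komi-tuning step can be sketched roughly as follows. This is an illustration only. It assumes each random playout yields a score margin (Black's points minus White's, before komi) and that a playout which ends exactly level after komi counts as half a win. The function names are hypothetical and the margins are made-up numbers, not results from any real playouts.

```python
def black_win_rate(margins, komi):
    """Fraction of playouts Black wins at this komi; exact ties count as half."""
    wins = sum(1.0 for m in margins if m > komi)
    ties = sum(0.5 for m in margins if m == komi)
    return (wins + ties) / len(margins)

def tune_komi(margins, candidates):
    """Pick the candidate komi whose Black win rate is closest to 50%."""
    return min(candidates, key=lambda k: abs(black_win_rate(margins, k) - 0.5))

# Made-up score margins, purely to exercise the functions.
margins = [4, -2, 6, 1, -5, 3, 8, -1, 3, -7]
balanced_komi = tune_komi(margins, range(0, 5))
```

With the komi balanced this way, any rule change applied to one colour only (such as allowing suicide just for White) shows up directly as a shift away from 50%.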


That's all for today :)

Mark

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
