Re: [computer-go] Some thoughts about Monte Carlo

dhillismail Fri, 18 Jan 2008 10:46:17 -0800

If you like heavy playouts, you will love progressive widening. It all comes 
down to move prioritization and it's OK if it takes a little time to calculate. 
If I had a good traditional program (and I wish I did) I would focus my efforts 
to combine it with UCT-MC?here.


I found it helpful to read these 2 papers (about CrazyStone and Mango) 
side-by-side.
http://remi.coulom.free.fr/Amsterdam2007/MMGoPatterns.pdf?section 4.2 especially
and 
http://www.cs.unimaas.nl/g.chaslot/papers/pMCTS.pdf
Mogo does it too, though from a different angle.
http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf

About the mercy rule: When it is invoked, it doesn't only shorten the playout, 
but it also obviates the need to score the board. That might help explain your 
timing measurements. It has one significant drawback in that it will mess up 
any statistics re. percentage of time black owns a position on the board. I 
think it is safe to set a relatively aggressive threshold in the beginning game 
(the actual game), but best to tighten it up later, when mistakes are more 
likely to favor one side over the other.

- Dave Hillis

-----Original Message-----
From: Mark Boon <[EMAIL PROTECTED]>
To: computer-go <[email protected]>
Sent: Fri, 18 Jan 2008 8:03 am
Subject: [computer-go] Some thoughts about Monte Carlo


I'm fairly new on the subject of Monte Carlo and am in the process of catching 
up on reading, so I hope you guys have some patience with me while I get into 
this and ask a lot of questions. I got side-tracked away from computer-Go 
programming for quite a while for various reasons but have been at it again 
full-time recently.?
?
Let me start by saying I haven't followed the whole Monte-Carlo / UCT story all 
that well. Although it was clear it did well on 9x9 I wasn't all that convinced 
it would scale up to 19x19. That it did well on 9x9 didn't surprise me much as 
I've always felt 9x9 is susceptible to see strong play using some extreme 
brute-force method using today's computing power.?
?
Combining Monte-Carlo methods with what's generally called "heavy playouts" 
changes things a lot however and this is where it piqued my interest. I often 
see reference to Monte-Carlo programs as opposed to "traditional" Go programs. 
But what I'm seeing with the heavy playouts is "deja-vu all over again". It 
seems to me the evolution of the heavy playouts is following a similar course 
of that of the first rule-based Go programs in the early 1980's. To use another 
famous quote "history doesn't repeat itself but it rhymes."?
?
So I wouldn't be surprised at all if at some point you'll see a marriage of the 
best ideas of traditional Go programs and Monte-Carlo / UCT. In fact, this is 
most likely already happening as these Monte-Carlo programs use algorithms / 
ideas from the traditional programs for tactics, pattern-matching and possibly 
others.. And I go out on a limb to predict that at some point "heavy playouts" 
will use information like territory, eye-space and group-strength to guide 
their playouts just like the "traditional" programs do. And that using fixed 
3x3 patterns will be a passing fad.?
?
Anyway, as it is I have made a few initial forrays into Monte-Carlo algorithms. 
I hereby thank Peter Drake on his work on Orego, which is a fine package to 
learn about Monte-Carlo. I used it to make my own implementation. Not that 
there was much wrong with Orego but the best way to understand and internalize 
a concept is to actually implement it. Moreover I felt I needed a framework 
that is a bit more generalized than Orego. Lastly I was convinced of course 
that I'd be able to make it faster. That last part didn't turn out to be true 
though. On my iMac, which is a 2Ghz Core Duo, I get a little under 30K playouts 
on 9x9. A little over 50K when using both cores. Orego is in the same 
ball-park. At least making a more generalized implementation didn't hurt 
performance so I'm OK with this first attempt. I think there's still room for 
optimizations, if I thought it was important at this stage.?
?
This comes to my first point. Optimizing early in a project is like listening 
to the devil. It eats up a lot of time, the visible progress is gratifying but 
in the grand scheme of things it's not all that important to do early. I 
implemented it using pseudo-liberties because... uh, well, because that's what 
everyone seemed to be doing to get high numbers of playouts per second. But I 
already start to doubt using pseudo-liberties are all that useful. Is anybody 
still using pure-random playouts for anything? As soon as you start to do 
anything more, pseudo-liberties are pretty useless. Yes, using real liberties 
slows things down a lot. But the speed of the random playout becoms less and 
less important with heavy playouts.?
?
Next I was thinking about another subject that got some attention on this list, 
the mercy rule. It seems to save about 10% in the number of moves per game (on 
19x19) and result in about 20% gain in performance. This discrepancy is most 
likely due to the fact that the administration, whether using pseudo-liberties 
or real, is much slower towards the end of the game because you have more moves 
that merge chains. And those 10% moves it saves are of course always at the 
end. So is it relevant? I don't know whether heavy playouts will be slower 
towards the end of the game or not. Possibly yes, as more moves made will have 
a small number of liberties that will need tactical analysis. I'd say that 
generally reducing the move-count is a good thing whichever method one uses. 
Possibly at a later stage more sophisticated methods can be developed to abort 
a game early.?
?
Lastly (for this post anyway) I looked at the multiple-suicide question. 
Whether allowing it or not seems to have negligable effect on performance. What 
I did first was to tune the komi until the win-ratio for Black and White is as 
close to 50% as possible using random playouts. The number ended up being 2. 
Next I allowed suicide only for White. This pushed Black's winning percentage 
up to 65%. I think this pretty much settles the matter and it's clear allowing 
suicide is very detrimental to quality of play in a Monte-Carlo program.?
?
That's all for today :)?
?
Mark?
?
_______________________________________________?
computer-go mailing list?
[EMAIL PROTECTED]
http://www.computer-go.org/mailman/listinfo/computer-go/?


________________________________________________________________________
More new features than ever.  Check out the new AIM(R) Mail ! - 
http://webmail.aim.com

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Some thoughts about Monte Carlo

Reply via email to