date:20090301

Re: [computer-go] Published source for mercy rule?

2009-03-01 Thread Łukasz Lew

mercy rule in libego:
http://github.com/lukaszlew/libego/blob/77b4dd4e035e4b44c17204557c9930a52e10e0c0/ego/playout.h
line 55.

Regards,
Łukasz Lew

On Fri, Feb 27, 2009 at 01:00, Seth Pellegrino se...@lclark.edu wrote:
 Hello list,

 I've managed to track the idea of a mercy rule in monte-carlo playouts back
 to a mail sent to this list by David Hillis:

 http://computer-go.org/pipermail/computer-go/2006-December/007478.html

 I'm currently putting the finishing touches on a paper which includes this
 idea, and I was wondering if anyone knew of any published sources where I
 could cite the idea?

 Thanks,

 Seth Pellegrino

 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Published source for mercy rule?

2009-03-01 Thread Don Dailey

Has anyone done a good analysis on the value of the mercy rule?   It is
a major or minor speedup?   And how does it affect the quality of the
search?   Is it more useful with bigger board sizes?  

It seems like I remember seeing that many believed it to be a very minor
improvement if any,  but I'm curious if that is the current
understanding (as I have not experimented with it myself.)

- Don
 

On Sun, 2009-03-01 at 18:28 +0100, Łukasz Lew wrote:
 mercy rule in libego:
 http://github.com/lukaszlew/libego/blob/77b4dd4e035e4b44c17204557c9930a52e10e0c0/ego/playout.h
 line 55.
 
 Regards,
 Łukasz Lew
 
 On Fri, Feb 27, 2009 at 01:00, Seth Pellegrino se...@lclark.edu wrote:
  Hello list,
 
  I've managed to track the idea of a mercy rule in monte-carlo playouts back
  to a mail sent to this list by David Hillis:
 
  http://computer-go.org/pipermail/computer-go/2006-December/007478.html
 
  I'm currently putting the finishing touches on a paper which includes this
  idea, and I was wondering if anyone knew of any published sources where I
  could cite the idea?
 
  Thanks,
 
  Seth Pellegrino
 
  ___
  computer-go mailing list
  computer-go@computer-go.org
  http://www.computer-go.org/mailman/listinfo/computer-go/
 
 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Published source for mercy rule?

2009-03-01 Thread dhillismail

Bruno Bouzy gives it credit for a 1% strength improvement in the tutorial:
Old-fashioned Computer Go vs Monte-Carlo Go
http://www.math-info.univ-paris5.fr/~bouzy/publications/CIG07-tutorial-1avril2007.pdf
(It's a pretty good read in general, btw.)

The Mercy rule interacts with ownership maps and RAVE in ways that are a bit of 
a nuisance.

- Dave Hillis


-Original Message-
From: Don Dailey dailey@gmail.com
To: computer-go computer-go@computer-go.org
Sent: Sun, 1 Mar 2009 1:51 pm
Subject: Re: [computer-go] Published source for mercy rule?



Has anyone done a good analysis on the value of the mercy rule?   It is
 major or minor speedup?   And how does it affect the quality of the
earch?   Is it more useful with bigger board sizes?  
It seems like I remember seeing that many believed it to be a very minor
mprovement if any,  but I'm curious if that is the current
nderstanding (as I have not experimented with it myself.)
- Don

On Sun, 2009-03-01 at 18:28 +0100, Łukasz Lew wrote:
 mercy rule in libego:
 
http://github.com/lukaszlew/libego/blob/77b4dd4e035e4b44c17204557c9930a52e10e0c0/ego/playout.h
 line 55.
 
 Regards,
 Łukasz Lew
 
 On Fri, Feb 27, 2009 at 01:00, Seth Pellegrino se...@lclark.edu wrote:
  Hello list,
 
  I've managed to track the idea of a mercy rule in monte-carlo playouts back
  to a mail sent to this list by David Hillis:
 
  http://computer-go.org/pipermail/computer-go/2006-December/007478.html
 
  I'm currently putting the finishing touches on
 a paper which includes this
  idea, and I was wondering if anyone knew of any published sources where I
  could cite the idea?
 
  Thanks,
 
  Seth Pellegrino
 
  ___
  computer-go mailing list
  computer-go@computer-go.org
  http://www.computer-go.org/mailman/listinfo/computer-go/
 
 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/
___
omputer-go mailing list
omputer...@computer-go.org
ttp://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

RE: [computer-go] re-CGT approximating the sum of subgames

2009-03-01 Thread Philipp Hennig


Hi Martin, Dave and others,

I have never posted to this list before (so, hi everyone! :-), but  
your discussion on approximate CGT caught my eye.


Dave. you mentioned that you are interested in trying to mix UCT and  
CGT:

Dave Devos wrote on 19 Feb 2009, 19:33 GMT:
Ok, I'll look into your temperature discovery article. I noticed  
that Amazons is mentioned a lot in CGT, but I am not familiar with  
the game. I want to use montecarlo to discover subgames. Then I want  
to use CGT techniques to evaluate and sum subgames to get a full  
board evaluation. If this is possible, it could dramatically reduce  
the combinatorial explosion on large boards (like you said, totally  
killing full board search) This is the idea. I don't even know if it  
is possible, but I definitely want to give it a try.



I worked  on a similar idea during a short project early on in my PhD  
(which has since taken other directions). We (David Stern, Thore  
Graepel and myself) tried to use UCT to measure the temperature of  
Amazons games. You can find a technical report at


http://www.inference.phy.cam.ac.uk/ph347/CPGS-Report_Hennig.pdf
(I turned this into my first-year report, so you will have to excuse  
the length and the rambling at the beginning. If you know about UCT  
and CGT, you can directly jump to chapters 2.3 or 3.0).


The short of it is: The results were not very encouraging. The problem  
is that TDS is searching for an event (the switch from the stack of  
coupons to play on the board) which happens at some depth 1 in the  
game tree*, and UCT is not very good at performing inference on such  
quantities. The only thing it is really good at is inference about the  
value of the topmost nodes, where the statistical fidelity is good  
(nodes even a few steps deep into the tree will often only be visited  
a handful of times, so there is not much data available to make a good  
estimate from).
Further, adding a coupon stack to the overall game seems to sometimes  
create situations where the roll-outs become biased towards one  
player. In such situations, it can take UCT very long to discover the  
right solution, wasting a lot of search time expanding the wrong part  
of the tree. In chapter 5.1.1 of my report, you will find an  
experimental study of this behaviour using an artificial game tree.
Then, there are a few more subtleties to keep in mind. One is that Go  
games rarely are truly separable into sub-games (End-Games are an  
exception, as Martin showed in his paper mentioned in his earlier  
mail). Independence may hinge on just one particularly ingenious  
move deep in the tree and therefore might be hard to find by Monte- 
Carlo search (but this is pure conjecture, so don't let yourself be  
stopped from trying). Finally, searching for independence will cost  
time itself.
So, in the end, you need to search for independence, which takes time  
and introduces errors, because you might miss crucial connections  
between sub-games. Then you need to search for the temperature of each  
sub-game, which is more expensive than just searching for a good move  
in the sub-game, since the TDS tree is more complex than the board  
tree itself, and UCT needs way longer to build up any statistical  
evidence around the interesting bits deep in the search tree (where  
you also find the best next move). It also causes more uncertainty,  
since UCT-based TDS will have an unknown error. And, finally, playing  
HOTSTRAT is in itself an approximation to optimal play, although  
apparently a good one, and with bounded error.
UCT on the whole board, on the other hand, has to struggle with the  
combinatorial explosion of the tree. But we know it converges to  
optimal play, eventually.
Bottom line: I would be interested to hear about your results. All of  
my arguments above are obviously not very quantitative, and it might  
still be possible to solve Go by Divide and Conquer (humans do it,  
apparently), so I'm not saying you shouldn't try, not at all. But just  
naively mixing TDS and UCT, as I did, is not recommended. :-)


Good luck with your experiments!

Philipp

*more precisely, the optimal shift happens at depth ~ ((T_max - T) /  
delta), where T is the temperature of the game, T_max is the value of  
the top-most coupon and delta is the resolution of the measurement  
(i.e. the difference between face values of consecutive coupons),  
because you will typically only have an upper bound of the true  
temperature, and the value of the topmost coupon on the stack has to  
be larger than the temperature. Also, TDS robustness to increasing  
delta or varying delta over the stack is not entirely understood, (see  
Martin's original TDS paper). In particular, something like a binary  
search for the right Temperature (starting with a huge delta, and  
subsequently refining the stack) didn't work for us. 
___

computer-go mailing list
computer-go@computer-go.org

Re: [computer-go] Published source for mercy rule?

Re: [computer-go] Published source for mercy rule?

Re: [computer-go] Published source for mercy rule?

RE: [computer-go] re-CGT approximating the sum of subgames

4 matches

Site Navigation

Mail list logo

Footer information