Re: [computer-go] Go/Games with modified scores

terry mcintyre Wed, 15 Oct 2008 13:47:18 -0700

> Darren Cook wrote:

> > What do your program's playouts think when presented with the board
> > position in the article? This is a terminal position, both players have
> > passed, a comfortable white win, yet pure random playouts think black
> > will win more often.
> > 
> >>>  http://dcook.org/compgo/article_the_problem_with_random_playouts.html


Looking at this position, I asked "how do humans look at this situation?"

As Darren said, we don't extrapolate all possible lines of play to the end. 
Instead, humans create a sort of proof which condenses vast numbers of playouts 
into a few simple thoughts.

For instance, at the top of the board, there are two black stones in atari. One 
of the neighboring white groups has two liberties; the other has three. 
Therefore, black cannot save his stones by a direct capture. Suppose black 
makes a threat - such as putting the white stones in the upper left into atari. 
Such a threat can be answered by capturing the two black stones, gaining two 
extra liberties for the white group. A short tree search confirms that black 
cannot pursue that line. The other alternative is for the two black stones to 
extend to the edge of the board. A short tree analysis shows the futility of 
that line of play. This sort of analysis is excruciatingly obvious to a human 
player. Similar analysis determines the status of the other groups, and the 
four singlet stones - there is no prayer of changing the outcome, assuming 
proper play by the opponent.
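
The liberty counting that this argument starts from is cheap to mechanize. 
Here is a minimal sketch in Python, assuming a toy board encoding (a list of 
strings with "X" for black, "O" for white, "." for empty). The board below is 
made up for illustration and is not the position from the article; the 
function names are hypothetical.

```python
def group_and_liberties(board, row, col):
    """Return (stones, liberties) for the group containing (row, col)."""
    color = board[row][col]
    if color == ".":
        raise ValueError("no stone at that point")
    stones, liberties = set(), set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in stones:
            continue
        stones.add((r, c))
        # Visit the four orthogonal neighbours that are on the board.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < len(board) and 0 <= nc < len(board[nr]):
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))
                elif board[nr][nc] == color:
                    stack.append((nr, nc))
    return stones, liberties


def in_atari(board, row, col):
    """A group is in atari when it has exactly one liberty left."""
    return len(group_and_liberties(board, row, col)[1]) == 1


# Illustrative position only: two black stones on the top edge,
# wrapped by a single white group.
BOARD = [
    ".OXXO",
    ".O.OO",
    ".OOO.",
    ".....",
]
```

With this board, in_atari(BOARD, 0, 2) is true for the two black stones, while 
the surrounding white group has plenty of liberties. This only counts 
liberties; the short tree search described above would sit on top of it.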

This is an end-game situation; the outcome is already blindingly obvious. Back 
up a few moves and the position is a bit more open, the min-max trees become a 
bit bushier, but it's still tractable for ordinary mid-kyu human players.

Is it possible to incorporate information gleaned from such directed proofs of 
life-death status into playouts, so that the best line of play is automatically 
followed? Supposing that min-max search reveals that B is the best response to 
A, while C or D would "snatch defeat from the jaws of victory", can the 
playouts be discouraged from following A with C or D -- until such time as the 
situation has changed in a provable sense?
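
One way this might look, as a rough sketch rather than a description of any 
existing program: keep a table of replies that a separate min-max search has 
proven, and let the playout policy consult it before falling back to a random 
move. Everything here (the ProvenReplies class, its methods, moves as plain 
labels) is a hypothetical illustration; the search that fills the table is 
assumed to exist elsewhere.

```python
import random


class ProvenReplies:
    """Replies to a trigger move that a local search has already settled."""

    def __init__(self):
        self._best = {}    # trigger move -> proven best reply
        self._losing = {}  # trigger move -> replies shown to lose

    def record(self, trigger, best, losing=()):
        self._best[trigger] = best
        self._losing[trigger] = set(losing)

    def invalidate(self, trigger):
        # Call this once the local situation has provably changed.
        self._best.pop(trigger, None)
        self._losing.pop(trigger, None)

    def choose(self, last_move, legal_moves, rng):
        best = self._best.get(last_move)
        if best is not None and best in legal_moves:
            return best
        # No usable proven reply: play randomly, but avoid refuted moves.
        losing = self._losing.get(last_move, set())
        candidates = [m for m in legal_moves if m not in losing]
        # If every legal move is refuted, fall back to all of them.
        return rng.choice(candidates or list(legal_moves))


table = ProvenReplies()
table.record("A", best="B", losing=["C", "D"])
reply = table.choose("A", ["B", "C", "D", "E"], random.Random(0))
```

Here reply is "B". After table.invalidate("A") the playout is back to choosing 
uniformly among the legal moves.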

When I spoke of "a short tree analysis", in human terms, that tree is short 
because we exclude dumb moves. When a group is in atari at the top left, we 
don't try to save it by playing at random locations in the middle or bottom 
right; we know from the shape that some moves have a chance of altering the 
outcome, and others could not conceivably make a difference. We give up some 
stones and save others; sometimes we deliberately sacrifice a few to gain 
advantages. Our analysis, at a fundamental level, depends on the basic rule: 
players take turns. You don't get to make two moves in a row; instead, you make 
a threat which forces the move you want, causing the weakness which you exploit.

Much of what we know about life and death can easily be expressed in the form 
of conditional move trees or state machines -- if Black plays A, respond with 
B; switch states; follow C with D; etc.
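
As a small sketch of what such a conditional move tree could look like in 
code (the state names, the TREE table and the ResponseMachine class are all 
made up for illustration):

```python
# state -> {opponent move: (our reply, next state)}
TREE = {
    "start": {"A": ("B", "after_AB")},
    "after_AB": {"C": ("D", "settled")},
    "settled": {},
}


class ResponseMachine:
    """Walks a conditional move tree one opponent move at a time."""

    def __init__(self, tree, start="start"):
        self.tree = tree
        self.state = start

    def reply(self, opponent_move):
        entry = self.tree.get(self.state, {}).get(opponent_move)
        if entry is None:
            # Not a move this tree knows about: no forced reply, and the
            # state is left unchanged.
            return None
        move, self.state = entry
        return move


machine = ResponseMachine(TREE)
first = machine.reply("A")   # "B", and the machine switches state
second = machine.reply("C")  # "D"
```

A return value of None means the tree has nothing to say, so the caller is 
free to pick a move some other way.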

Would such a state machine integrate well with random playouts? It would help 
if the state machine can be probabilistic; sometimes you definitely want to 
save that 25-point group, other times you are willing to think about giving up 
10 points here to gain 12 there - perhaps there is a ko fight in progress. What 
is at stake? What is being risked?
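
A probabilistic version could be sketched along these lines. The weighting 
formula is an invented heuristic, purely for illustration: the more points the 
threatened group is worth relative to what could be gained elsewhere, the more 
often the playout follows the state machine's reply. The function names and 
the floor parameter are assumptions, not taken from any program.

```python
import random


def follow_probability(stake, alternative_gain, floor=0.2):
    """Probability of playing the state machine's reply.

    stake: points lost if the reply is not played.
    alternative_gain: points available by playing elsewhere instead.
    floor: minimum probability, used when playing elsewhere is worth more.
    """
    if stake <= 0:
        return floor
    ratio = (stake - alternative_gain) / stake
    ratio = max(0.0, min(1.0, ratio))
    # ratio 1.0 gives probability 1.0; ratio 0.0 gives the floor.
    return ratio + floor * (1.0 - ratio)


def pick_move(forced_reply, legal_moves, stake, alternative_gain, rng):
    p = follow_probability(stake, alternative_gain)
    if forced_reply in legal_moves and rng.random() < p:
        return forced_reply
    return rng.choice(legal_moves)


rng = random.Random(0)
save_big_group = pick_move("B", ["B", "C", "D"], 25, 0, rng)
```

With 25 points at stake and nothing to gain elsewhere, the reply is always 
followed. With 10 points at stake and 12 available elsewhere, the probability 
drops to the floor and the playout mostly explores other moves.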


      
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
