Re: [computer-go] Dependency inversions in Monte-Carlo Go

Jason House Wed, 08 Apr 2009 05:42:31 -0700

On Apr 8, 2009, at 3:15 AM, Łukasz Lew <lukasz....@gmail.com> wrote:

On Tue, Apr 7, 2009 at 23:52, Claus Reinke <claus.rei...@talk21.com>wrote:

Last time I looked more closely at what my MC bot (simple, no tree)

was doing, I noticed that it has a tendency to try the impossiblemoves

(suicides) first, discarding them as impossible for every genmove.

Looking at more statistics, and thinking about it, it finallydawned onme that this is a consequence of the standard move evaluationapproach

based on win rate:

- what one hopes for is the move with the best chance of winning
   (move enables win)
- what one might get is a move that can only be played when winning
   (win enables move)

For suicide moves, this is particularly obvious: they become possible
only in the small number of games in which the opponent group
surrounding the pseudo-eye has died (or is killed by playing in the
pseudo-eye after removing all other liberties). The larger the group,
the more likely that the game is going to be won if that group dies
(roughly), so the larger the opponent group, the more tempting its
pseudo-eyes seem for win-rate-based evaluation, however unlikely
it is that the group actually dies in non-random play (certainly not
by starting with the pseudo-eye).

Something similar might be happening at a less obvious scale,
such as playing a move into self-atari: there is one opponent
move that renders this useless, but it is only one move - any
game in which the stone is not captured might look rather
good in terms of winning.


Good insight, well known too.


Is there a known method of reducing the impact of these outliers
(other than building a real tree and seeing by trial-and-error
that many good-looking moves aren't really useful)?


Heavy playouts introduce lot's of knowledge to avoid moves that can be
easily and with high probability detected that are bad.

If you are asking about domain-independent techniques, then only MCTS
and AMAF/RAVE are well known.
Nothing else is in popular. But more and more people are thinking how
to make adaptive playouts.

Heavy playouts aren't the only way. Initialization heuristics andprogressive widening also work.

It seems
that one cannot just devalue moves with low hit counts - after
all, if there is only one sequence of moves that will rescue the
game, one doesn't want to discount it just because it is rare.

One thing that might help are move-number statistics: those
moves that tend to be played late in the playouts in which
they occur might depend on other moves to be played first,
so perhaps one should have lower bounds on when each
move can be considered for actual play?


In light playout moves are played at random, so the moment of playing
a move doesn't carry too much information.

Lukasz


Claus




_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Dependency inversions in Monte-Carlo Go

Reply via email to