Re: [computer-go] Dependency inversions in Monte-Carlo Go

Łukasz Lew Wed, 08 Apr 2009 08:03:29 -0700

On Wed, Apr 8, 2009 at 14:41, Jason House <jason.james.ho...@gmail.com> wrote:
> On Apr 8, 2009, at 3:15 AM, Łukasz Lew <lukasz....@gmail.com> wrote:
>
>> On Tue, Apr 7, 2009 at 23:52, Claus Reinke <claus.rei...@talk21.com>
>> wrote:
>>>
>>> Last time I looked more closely at what my MC bot (simple, no tree)
>>> was doing, I noticed that it has a tendency to try the impossible moves
>>> (suicides) first, discarding them as impossible for every genmove.
>>>
>>> Looking at more statistics, and thinking about it, it finally dawned on
>>> me that this is a consequence of the standard move evaluation approach
>>> based on win rate:
>>>
>>> - what one hopes for is the move with the best chance of winning
>>>   (move enables win)
>>> - what one might get is a move that can only be played when winning
>>>   (win enables move)
>>>
>>> For suicide moves, this is particularly obvious: they become possible
>>> only in the small number of games in which the opponent group
>>> surrounding the pseudo-eye has died (or is killed by playing in the
>>> pseudo-eye after removing all other liberties). The larger the group,
>>> the more likely that the game is going to be won if that group dies
>>> (roughly), so the larger the opponent group, the more tempting its
>>> pseudo-eyes seem for win-rate-based evaluation, however unlikely
>>> it is that the group actually dies in non-random play (certainly not
>>> by starting with the pseudo-eye).
>>>
>>> Something similar might be happening at a less obvious scale,
>>> such as playing a move into self-atari: there is one opponent
>>> move that renders this useless, but it is only one move - any
>>> game in which the stone is not captured might look rather
>>> good in terms of winning.
>>
>> Good insight, well known too.
>>
>>>
>>> Is there a known method of reducing the impact of these outliers
>>> (other than building a real tree and seeing by trial-and-error
>>> that many good-looking moves aren't really useful)?
>>
>> Heavy playouts introduce lot's of knowledge to avoid moves that can be
>> easily and with high probability detected that are bad.
>>
>> If you are asking about domain-independent techniques, then only MCTS
>> and AMAF/RAVE are well known.
>> Nothing else is in popular. But more and more people are thinking how
>> to make adaptive playouts.
>
> Heavy playouts aren't the only way. Initialization heuristics and
> progressive widening also work.


Indeed :)


>
>
>>> It seems
>>> that one cannot just devalue moves with low hit counts - after
>>> all, if there is only one sequence of moves that will rescue the
>>> game, one doesn't want to discount it just because it is rare.
>>>
>>> One thing that might help are move-number statistics: those
>>> moves that tend to be played late in the playouts in which
>>> they occur might depend on other moves to be played first,
>>> so perhaps one should have lower bounds on when each
>>> move can be considered for actual play?
>>
>> In light playout moves are played at random, so the moment of playing
>> a move doesn't carry too much information.
>>
>> Lukasz
>>
>>>
>>> Claus
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> computer-go mailing list
>>> computer-go@computer-go.org
>>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>>
>> _______________________________________________
>> computer-go mailing list
>> computer-go@computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Dependency inversions in Monte-Carlo Go

Reply via email to