Re: [computer-go] Transpositions in Monte-carlo tree search

Matthew Woodcraft Wed, 01 Apr 2009 12:04:06 -0700

Erik van der Werf wrote:
> Jonas Kahn wrote:
>> No there is no danger. That's the whole point of weighting with N_{s,a}.
>>
>> N_{s,a} = number of times the node s has been visited, starting with parent
>> a.
>>
>> You can write the value of a node a as
>> V_a = (\sum_{s \in sons(a)} N_{s,a} V_s) / (\sum_{s \in sons(a)} N_{s,a})
>>
>> where V_s is ideally the «true» value of node s.
>> In UCT2, they use V_s = Q_{s,a} the win average of simulations going
>> through a, and then through s.
>> In UCT3, they use V_s = Q_s the win average of all simulations through
>> s.


> There is a danger. The problem is that the selection policy also
> implements the soft-max like behavior that ensures convergence to the
> minimax result. If you back up to all possible parents, including
> those for which the child would have been an inferior choice, you may
> get into trouble.

That's what I was worried about.

But I think it's ok the way Jonas describes above: you don't add
anything to the false-parent node's simulation count, and you don't
change the weight of the false-child in its value; you just change the
evaluation of the false-child.

(This means that the effect of backing up to alternate parents will be
smaller than the effect of backing up to the 'true' parent, which is
presumably part of the reason why this variant is less attractive.)
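
To make the bookkeeping concrete, here is a rough sketch in Python of how I read Jonas's scheme. It is not taken from any real program: the Node class, the string keys and the backup/value helpers are all made up for illustration, and it assumes every result is scored from one player's point of view (no min/max sign flipping). Position s is reachable from both a and b; a simulation played through a leaves b's counts N_{s,b} alone, but under the UCT3 rule it still moves b's value, because Q_s changes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical transposition-table entry (names are illustrative only)."""
    wins: float = 0.0   # wins over all simulations through this position
    visits: int = 0     # all simulations through this position
    # Per-edge statistics, keyed by child: N_{s,a} and the wins behind Q_{s,a}.
    edge_visits: dict = field(default_factory=dict)
    edge_wins: dict = field(default_factory=dict)


def q(node):
    """Q_s: win average of all simulations through the node."""
    return node.wins / node.visits if node.visits else 0.0


def backup(table, path, result):
    """Back up one simulation along the path that was actually played."""
    for parent_key, child_key in zip(path, path[1:]):
        parent = table[parent_key]
        # Only the edge on the real path gains a visit (the N_{s,a} weight).
        parent.edge_visits[child_key] = parent.edge_visits.get(child_key, 0) + 1
        parent.edge_wins[child_key] = parent.edge_wins.get(child_key, 0.0) + result
    for key in path:
        table[key].visits += 1
        table[key].wins += result


def value(table, a_key, variant="UCT3"):
    """Value of a = sum(N_{s,a} * V_s) / sum(N_{s,a}) over the sons s of a."""
    a = table[a_key]
    total = sum(a.edge_visits.values())
    if total == 0:
        return q(a)
    acc = 0.0
    for s_key, n in a.edge_visits.items():
        if variant == "UCT2":
            v = a.edge_wins[s_key] / n   # V_s = Q_{s,a}
        else:
            v = q(table[s_key])          # V_s = Q_s (UCT3)
        acc += n * v
    return acc / total


# a and b both lead to s; b also leads to t.
table = {key: Node() for key in ("a", "b", "s", "t")}
backup(table, ["b", "s"], 1.0)
backup(table, ["b", "t"], 0.0)
before = value(table, "b", "UCT3")

# One losing simulation reaches s through a, the other parent.
backup(table, ["a", "s"], 0.0)
after_uct3 = value(table, "b", "UCT3")
after_uct2 = value(table, "b", "UCT2")
```

In this toy run b's edge counts stay at one visit each for s and t. With V_s = Q_s, b's value drops from 0.5 to 0.25, since s is now evaluated at 0.5 but keeps the same weight. With V_s = Q_{s,a}, b's value stays at 0.5.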

-M-
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
