If you remove the limit on the tree size, does this still occur?

Otherwise, I agree with Gonçalo - it seems unlikely. When I implemented
Double Step [1] and got weird results, it turned out I had forgotten to
switch the perspective of the bots and, even worse, some of the move
generation was buggy.


[1] a simple theoretical game used in one of my favorite papers to show
the laziness of Monte Carlo tree search when it is ahead
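
For anyone who wants to check the two suspects raised in this thread (the
hard limit on tree size and the perspective switch during backup), here is
a minimal sketch of UCT with both made explicit. It is not Dave's robot and
not Double Step: the game is a toy take-away game (remove 1 to 3 stones,
whoever takes the last stone wins), and the names Node, rollout, uct_search
and max_nodes are all made up for illustration. Once the node budget is
used up, the stored tree stops growing and every simulation is finished by
a random playout below the last stored node.

```python
import math
import random


class Node:
    """One stored tree node of the toy take-away game (hypothetical example)."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player to move here
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]


def rollout(stones):
    """Random playout; 1.0 if the player to move at `stones` wins, else 0.0."""
    first_player_to_move = True
    while stones > 0:
        stones -= random.randint(1, min(3, stones))
        if stones == 0:
            return 1.0 if first_player_to_move else 0.0
        first_player_to_move = not first_player_to_move
    return 0.0  # no stones left: the previous player took the last one


def ucb(child, parent_visits, c):
    if child.visits == 0:
        return float("inf")       # try every stored child at least once
    mean = child.wins / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def uct_search(root_stones, iterations, max_nodes, c=1.4):
    root = Node(root_stones)
    node_count = 1
    for _ in range(iterations):
        # Selection: walk down the stored tree by UCB.
        node = root
        while node.children:
            node = max(node.children, key=lambda ch: ucb(ch, node.visits, c))
        # Expansion: only while the node budget allows it (the hard limit).
        moves = legal_moves(node.stones)
        if moves and node_count + len(moves) <= max_nodes:
            node.children = [Node(node.stones - m, node, m) for m in moves]
            node_count += len(moves)
            node = random.choice(node.children)
        # Simulation: random playout below the last stored node.
        reward = 1.0 - rollout(node.stones)   # seen by the player who moved in
        # Backup: flip the reward at every level (the perspective switch).
        while node is not None:
            node.visits += 1
            node.wins += reward
            reward = 1.0 - reward
            node = node.parent
    if not root.children:
        return None, node_count
    best = max(root.children, key=lambda ch: ch.visits)
    return best.move, node_count
```

Calling uct_search(10, 500, max_nodes=8) runs 500 simulations on a tree
that is frozen after a handful of nodes. Dropping the line that flips the
reward during backup is an easy way to reproduce the "forgot to switch
perspective" kind of bug and compare the two versions in self-play.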

On 11/06/2015 07:59 PM, Gonçalo Mendes Ferreira wrote:
> That doesn't seem very realistic. I'd guess your prior values are
> accurate but the simulations are biased or not representative. Or you
> lack precision in your transition quality floating points. Or there's
> a bug related to being an adversarial problem and you didn't have the
> robots swap colors? :)
> Do tell what it was when you discover the problem.
> Gonçalo F.
> On 06/11/2015 18:48, Dave Dyer wrote:
>> Developing a UCT robot for a new game, I have encountered a
>> surprising and alarming behavior:  the longer think time the
>> robot is given, the worse the results.  That is, the same robot
>> given 5 seconds per move defeats one given 30 seconds, or 180 seconds.
>> I'm still investigating, but the proximate cause seems to be
>> my limit on the size of the UCT tree.   As a memory conservation
>> measure, I have a hard limit on the size of the stored tree. After
>> the limit is reached, the robot continues running simulations, refining
>> the outcomes based on the existing tree and random playouts below
>> the leaf nodes.
>> My intuition would be that the search would be less effective in this
>> mode, but producing worse results (as measured by self-play) is
>> strongly counterintuitive.
>> Does it apply to Go?  Maybe not, but it's at least an indicator
>> that arbitrary decisions that "ought to" be ok can be very bad in
>> practice.
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
