Re: [Computer-go] alarming UCT behavior

Gonçalo Mendes Ferreira Fri, 06 Nov 2015 10:59:33 -0800

That doesn't seem very realistic. I'd guess your prior values are accurate but the simulations are biased or not representative. Or you are losing precision in the floating-point values for transition quality. Or there's a bug related to this being an adversarial problem and you didn't have the robots swap colors? :)


Do tell what it was when you discover the problem.


Gonçalo F.

On 06/11/2015 18:48, Dave Dyer wrote:

Developing a UCT robot for a new game, I have encountered a
surprising and alarming behavior:  the more thinking time the
robot is given, the worse the results.  That is, the same robot
given 5 seconds per move defeats one given 30 seconds, or 180 seconds.

I'm still investigating, but the proximate cause seems to be
my limit on the size of the UCT tree.   As a memory conservation
measure, I have a hard limit on the size of the stored tree. After
the limit is reached, the robot continues running simulations, refining
the outcomes based on the existing tree and random playouts below
the leaf nodes.
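
A minimal sketch of the capped-tree mode described above, not the robot's
actual code.  Everything here is an assumption made for illustration: the
toy game (single-pile Nim, take 1 to 3 stones, taking the last stone wins),
the names Node, playout and uct_search, the max_nodes parameter, and the
choice to treat a partially expanded node as a leaf once the cap is hit.

```python
import math
import random


class Node:
    def __init__(self, stones, parent=None):
        self.stones = stones      # stones left; the player to move takes 1..3
        self.parent = parent
        self.children = {}        # move -> Node
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]


def playout(stones, rng):
    """Uniformly random playout; True if the player to move at `stones` wins."""
    mover_is_first = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return mover_is_first   # whoever just took the last stone wins
        mover_is_first = not mover_is_first
    return False                    # no stones left: the player to move has lost


def uct_search(root_stones, iterations, max_nodes, rng, c=1.4):
    """UCT with a hard cap on stored nodes (hypothetical illustration)."""
    root = Node(root_stones)
    node_count = 1
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is non-terminal and fully expanded.
        while node.stones > 0 and len(node.children) == len(legal_moves(node.stones)):
            log_n = math.log(node.visits)
            node = max(
                node.children.values(),
                key=lambda ch: ch.wins / ch.visits
                + c * math.sqrt(log_n / ch.visits),
            )
        # Expansion: add one child, but only while under the node cap.
        untried = [m for m in legal_moves(node.stones) if m not in node.children]
        if untried and node_count < max_nodes:
            move = rng.choice(untried)
            child = Node(node.stones - move, parent=node)
            node.children[move] = child
            node = child
            node_count += 1
        # Simulation: once the cap is hit, this runs from an existing node,
        # so the tree stops growing and only its statistics are refined.
        to_move_wins = playout(node.stones, rng)
        result = 0.0 if to_move_wins else 1.0   # from the view of the mover INTO node
        # Backpropagation: flip the result at each level (adversarial game).
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    return root, node_count


if __name__ == "__main__":
    root, stored = uct_search(10, 2000, max_nodes=20, rng=random.Random(0))
    print("stored nodes:", stored)
    for move, child in sorted(root.children.items()):
        print("move", move, "visits", child.visits)
```

In this sketch the statistics at the frozen leaves can only converge to the
value of a uniformly random playout from that position, however many extra
simulations are run.  Whether that is what hurts the longer-thinking robot
is not established here.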

My intuition would be that the search would be less effective in this
mode, but producing worse results (as measured by self-play) is
strongly counterintuitive.

Does it apply to Go?  Maybe not, but it's at least an indicator
that arbitrary decisions that "ought to" be ok can be very bad in
practice.


