I have often observed similar behavior with Zen, even with plenty of memory.  Yamato's 
and my interpretation follows.  

Yamato tunes many parameters at relatively short time settings 
because a huge number of games is necessary for the tuning.  Our 
assumption is that the prior bias and the average outcome of the 
simulations have to be balanced (through multiple parameters) for 
strong play: patterns suggest good shapes but contribute almost 
nothing to local fights and complicated life-and-death problems, 
which have to be solved by the simulations.  Because Yamato optimized 
these parameters at a short time setting, the balance is broken 
(i.e., worse than optimal) at longer time settings.
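The balance described above can be sketched as a progressive-bias-style selection score, in which a pattern prior dominates at low visit counts and the simulation average takes over as visits accumulate. The weights and the exact form are hypothetical illustrations, not Zen's actual (unpublished) formula:

```python
import math

def selection_value(child, parent_visits, c_uct=0.7, w_prior=1.0):
    """UCT-style score mixing a pattern prior with the simulation average.

    c_uct and w_prior are hypothetical tuning parameters of the kind the
    text says Yamato tunes; the exact formula Zen uses is not public.
    """
    if child.visits == 0:
        return float("inf")              # try unvisited children first
    mean = child.wins / child.visits     # average simulation outcome
    explore = c_uct * math.sqrt(math.log(parent_visits) / child.visits)
    prior = w_prior * child.prior / (child.visits + 1)  # bias decays with visits
    return mean + explore + prior
```

If such weights are tuned at short time settings (few simulations per move), the point where the simulation average overtakes the prior can be miscalibrated for longer searches.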

Recently, Yamato has been tuning the parameters at as long a time setting as 
possible.


Dave Dyer: <20151106184850.13268e1...@computer-go.org>:
>Developing a UCT robot for a new game, I have encountered a
>surprising and alarming behavior:  the longer think time the
>robot is given, the worse the results.  That is, the same robot
>given 5 seconds per move defeats one given 30 seconds, or 180 seconds.
>I'm still investigating, but the proximate cause seems to be
>my limit on the size of the UCT tree.   As a memory conservation
>measure, I have a hard limit on the size of the stored tree. After
>the limit is reached, the robot continues running simulations, refining
>the outcomes based on the existing tree and random playouts below
>the leaf nodes.
>My intuition would be that the search would be less effective in this
>mode, but producing worse results (as measured by self-play) is 
>strongly counterintuitive.
>Does it apply to Go?  Maybe not, but it's at least an indicator
>that arbitrary decisions that "ought to" be ok can be very bad in
>practice.
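The capped search Dave describes can be sketched as follows. This is a toy illustration on Nim (take 1 or 2 stones; taking the last stone wins), not his actual code; only the cap behavior matches his description: once the node budget is exhausted, expansion stops, and further simulations only re-sample the frozen tree plus random playouts below its leaves.

```python
import math, random

class Node:
    def __init__(self):
        self.children = {}   # move -> Node
        self.visits = 0
        self.wins = 0.0      # from the perspective of the player who just moved

def uct_score(child, parent_visits, c=1.4):
    if child.visits == 0:
        return float("inf")
    return child.wins / child.visits + c * math.sqrt(
        math.log(parent_visits) / child.visits)

def search(stones, n_sims, max_nodes):
    """UCT with a hard cap on stored nodes: once max_nodes is reached,
    no new children are created, so extra simulations only harden the
    statistics of the existing (frozen) tree."""
    root, n_nodes = Node(), 1
    for _ in range(n_sims):
        node, s, path = root, stones, [root]
        # selection / capped expansion
        while s > 0:
            moves = [m for m in (1, 2) if m <= s]
            untried = [m for m in moves if m not in node.children]
            if untried and n_nodes < max_nodes:
                m = random.choice(untried)          # expand one child
                node.children[m] = Node()
                n_nodes += 1
                node = node.children[m]
                path.append(node)
                s -= m
                break
            if not node.children:                   # frozen leaf: cap hit
                break
            m, node = max(node.children.items(),
                          key=lambda mc: uct_score(mc[1], path[-1].visits))
            path.append(node)
            s -= m
        # random playout below the leaf
        to_move = (len(path) - 1) % 2
        p, ss = to_move, s
        winner = 1 - p if ss == 0 else None         # no move left: mover loses
        while winner is None:
            m = random.choice([mm for mm in (1, 2) if mm <= ss])
            ss -= m
            if ss == 0:
                winner = p
            p = 1 - p
        # backpropagation: node at depth d was reached by player (d - 1) % 2
        for d, n in enumerate(path):
            n.visits += 1
            if d > 0 and winner == (d - 1) % 2:
                n.wins += 1
    # most-visited move at the root
    return max(root.children.items(), key=lambda mc: mc[1].visits)[0]
```

With a tight cap, selection keeps descending the same few stored branches, so additional thinking time mostly re-confirms early estimates instead of reading deeper.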
Hideki Kato <mailto:hideki_ka...@ybb.ne.jp>
Computer-go mailing list
