Re: [Computer-go] Computer-go Digest, Vol 13, Issue 7

Hendrik Baier Fri, 04 Feb 2011 14:46:42 -0800

Hi Oliver,

unfortunately it's not easy to understand the effects of playout policy modifications on the behavior of MCTS. Some variations indeed refresh so rarely that their suggestions are pretty old most of the time. However, old doesn't necessarily mean outdated, as some types of information appear to be quite valuable throughout the search tree. It's hard to predict how valuable and how generalizable a given type of information is going to be on average.

An example from our paper: Move replies to single moves (LGRF-1) are used in 27.1% of moves. They stay in the reply table for only 4.2 playouts on average - they are overwritten or deleted constantly. Move replies to pairs of moves (LGRF-2), however, are applied just as often (27.7%), but remain in the table for 1700 playouts on average! Yet they still provide very useful information, as we can see from the playing strength.
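
To make the two metrics concrete, here is a minimal Python sketch of how usage rate and entry lifetime could be tracked for a reply table. It is an illustration only, not the code behind the numbers above. The names ReplyTable, update, usage_rate and mean_lifetime are made up for this example. Two assumptions are built in: a lookup that finds an entry counts as a "use", and lifetime is measured in playouts from when a reply is first stored until it is overwritten or deleted. With k=1 the context is a single previous move (as in LGRF-1), with k=2 a pair of moves (as in LGRF-2).

```python
class ReplyTable:
    """Hypothetical reply table keyed by the k previous moves (the context)."""

    def __init__(self):
        self.reply = {}       # context -> (reply move, playout index when stored)
        self.lifetimes = []   # playouts survived by each evicted entry
        self.lookups = 0
        self.hits = 0

    def lookup(self, context):
        # Assumption: every lookup that finds an entry counts as a "use".
        self.lookups += 1
        entry = self.reply.get(context)
        if entry is None:
            return None
        self.hits += 1
        return entry[0]

    def store(self, context, move, now):
        old = self.reply.get(context)
        if old is not None:
            if old[0] == move:
                return  # same reply again: keep the original timestamp
            self.lifetimes.append(now - old[1])  # overwritten
        self.reply[context] = (move, now)

    def forget(self, context, move, now):
        old = self.reply.get(context)
        if old is not None and old[0] == move:
            self.lifetimes.append(now - old[1])  # deleted
            del self.reply[context]

    def usage_rate(self):
        return self.hits / self.lookups if self.lookups else 0.0

    def mean_lifetime(self):
        if not self.lifetimes:
            return 0.0
        return sum(self.lifetimes) / len(self.lifetimes)


def update(table, moves, winner, now, k):
    """Update the table after one playout.

    moves  -- list of (player, move) pairs in playout order
    winner -- the player who won the playout
    now    -- index of the playout just finished
    k      -- context length: 1 for single moves, 2 for pairs of moves
    """
    for i in range(k, len(moves)):
        player, move = moves[i]
        context = tuple(m for _, m in moves[i - k:i])
        if player == winner:
            table.store(context, move, now)   # winner's replies are kept
        else:
            table.forget(context, move, now)  # loser's replies are dropped
```

In a setup like this, a table with a high usage rate and a very long mean lifetime looks like the LGRF-2 case above, while a high usage rate with a short mean lifetime looks like LGRF-1.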


Hendrik

   Hendrik - did you look at any metrics on the variations to see if you could establish 
why most of them were not successful?  I was wondering if looking at the percentage of 
suggestions made by the policy or the refresh rate would suggest what the problem is with 
some of the others.  For example, a policy which is providing a "good reply" 
nearly 100% of the time with a low refresh rate is probably too narrow for good 
exploration.

   Oliver


