Nice article, I love reading articles like that. I didn't see anything there that I clearly disagreed with, although I was expecting to.

I think the difference between heavy and light (uniform random) play-outs is fairly fixed. In other words, heavy may be some fixed number of Elo points stronger regardless of the depth of the search. (This isn't clear, but my graph seemed to indicate a more or less parallel line.)

The statement "will never give a strong computer go program" is rather devoid of meaning. You should either define "strong" or give some point of reference. If you simply mean other ways are better (such as heavy play-outs), then you have not imparted any useful information.

I definitely agree that once you've played a few thousand uniformly random games, there is little to be gained by doing a few thousand more. And as an evaluation function this is a relatively weak one: although surprisingly good in some ways, it has definite limitations.

AnchorMan hits the wall at about 5,000 simulations, and it is uniformly random with no other search involved. It would not be much stronger even with an infinite number of simulations.

The way to think about a play-out policy is to ask, "how good would it be given an infinite number of simulations?" The answer for uniform random is, "not very."

- Don

On Thu, 2007-07-26 at 11:54 +0900, Darren Cook wrote:
> A couple of months back I wrote an article on why I believe UCT with
> random playouts (as opposed to heavy playouts) will never give a strong
> computer go program. I've finally got it finished, edited and published:
> http://dcook.org/compgo/article_the_problem_with_random_playouts.html
>
> I'd be surprised if the UCT experts on this list will find much new
> there, but I hope some people will find it of value.
>
> Thanks to Magnus Persson for reviewing an earlier version.
>
> Darren
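
A rough sketch of the diminishing-returns point in the reply above, under a simplifying assumption that is not from the original post: each uniform random play-out is treated as an independent win/loss sample with true win probability p. In that case the sampling noise of the estimated win rate shrinks only as 1/sqrt(N). The function name `win_rate_standard_error` is made up for illustration and is not taken from AnchorMan or any other program.

```python
import math


def win_rate_standard_error(p: float, n: int) -> float:
    """Standard error of a win rate estimated from n independent play-outs.

    Assumes each play-out is an independent Bernoulli trial with win
    probability p (an illustrative simplification).
    """
    return math.sqrt(p * (1.0 - p) / n)


# Sampling noise at a few play-out counts, for an evenly balanced position.
for n in (100, 1_000, 5_000, 10_000, 100_000):
    print(f"{n:>7} play-outs: +/- {win_rate_standard_error(0.5, n):.4f}")
```

At p = 0.5 the standard error is about 0.007 at 5,000 play-outs and 0.005 at 10,000, so doubling the work buys very little precision. Whatever systematic error the uniform random policy has is not touched by this calculation at all, which is one way to read the "infinite number of simulations" question.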
