I didn’t notice a difference. Like everyone else, once I had RAVE implemented and added biases to the tree move selection, I found the UCT term made the program weaker, so I removed it.
David

> -----Original Message-----
> From: Computer-go [mailto:[email protected]] On Behalf Of
> Igor Polyakov
> Sent: Tuesday, April 14, 2015 3:37 AM
> To: [email protected]
> Subject: [Computer-go] UCB-1 tuned policy
>
> I implemented UCB1-tuned in my basic UCB-1 go player, but it doesn't seem
> like it makes a difference in self-play.
>
> It seems like it's able to run 5-25% more simulations, which means it's
> probably exploiting deeper (and has less steps until it runs out of room to
> play legal moves), but I have yet to see any strength improvements on
> 9x9 boards.
>
> As far as I understand, the only thing that's different is the formula.
> Has anyone actually seen any difference between the two algorithms?
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://computer-go.org/mailman/listinfo/computer-go
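For readers comparing the two formulas mentioned in the quoted message, here is a minimal sketch of UCB1 and UCB1-tuned as they are usually stated in the bandit literature. It is not taken from either poster's program. The function names, the parameter names, and the default exploration constant `c` are assumptions made for illustration; `mean_sq` is the average of squared rewards for the child, which equals `mean` when rewards are plain 0/1 win-loss results.

```python
import math


def ucb1(mean, n_parent, n_child, c=math.sqrt(2)):
    """UCB1 score: mean reward plus c * sqrt(ln(n_parent) / n_child).

    With c = sqrt(2) this is the textbook form mean + sqrt(2 ln n / n_j).
    n_child must be at least 1 and n_parent at least 1.
    """
    return mean + c * math.sqrt(math.log(n_parent) / n_child)


def ucb1_tuned(mean, mean_sq, n_parent, n_child):
    """UCB1-tuned score: the exploration term is scaled by an upper bound
    on the reward variance, capped at 1/4 (the largest possible variance
    of a reward confined to [0, 1]).
    """
    log_term = math.log(n_parent) / n_child
    # Sample variance plus a confidence margin on that variance estimate.
    variance_bound = mean_sq - mean * mean + math.sqrt(2 * log_term)
    return mean + math.sqrt(log_term * min(0.25, variance_bound))


if __name__ == "__main__":
    # Hypothetical node: 5 wins in 10 visits, parent visited 100 times.
    print(ucb1(0.5, 100, 10))
    print(ucb1_tuned(0.5, 0.5, 100, 10))
```

Because the variance factor is capped at 1/4 while UCB1 effectively uses a factor of 2, the tuned exploration bonus in this sketch is never larger than the UCB1 bonus for the same counts, which fits the observation above that the tuned version seems to exploit deeper.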
