I implemented UCB1-Tuned in my basic UCB1 Go player, but it doesn't seem to make any difference in self-play.

It does seem to run 5-25% more simulations, which probably means it is exploiting deeper (so each playout has fewer steps left before it runs out of legal moves), but I have yet to see any strength improvement on 9x9 boards.

As far as I understand, the only thing that differs is the selection formula. Has anyone actually seen a difference in strength between the two algorithms?
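For reference, here is a minimal sketch of the two formulas as I understand them from the original bandit paper. The function and parameter names are my own, rewards are assumed to lie in [0, 1] (for example 1 for a win, 0 for a loss), and a real Go program would typically add its own exploration constant, which is not shown here.

```python
import math

def ucb1(mean, n, total):
    """UCB1 score: average reward plus exploration bonus.

    mean  -- average reward of this move so far (assumed in [0, 1])
    n     -- number of times this move was tried (must be > 0)
    total -- number of times the parent position was visited
    """
    return mean + math.sqrt(2.0 * math.log(total) / n)

def ucb1_tuned(mean, sum_sq, n, total):
    """UCB1-Tuned score: the bonus is scaled by a variance estimate.

    sum_sq -- sum of squared rewards for this move
              (with 0/1 rewards this is simply the number of wins)
    """
    log_ratio = math.log(total) / n
    # Upper bound on the variance of this move's reward.
    variance_bound = sum_sq / n - mean * mean + math.sqrt(2.0 * log_ratio)
    # 1/4 is the largest possible variance of a reward in [0, 1].
    return mean + math.sqrt(log_ratio * min(0.25, variance_bound))

# Hypothetical example: a move with 5 wins in 10 tries, parent visited 100 times.
plain = ucb1(0.5, 10, 100)
tuned = ucb1_tuned(0.5, 5.0, 10, 100)
```

Because the variance term is capped at 1/4, the UCB1-Tuned bonus written this way can never exceed sqrt(1/8), about 0.35, of the plain UCB1 bonus, which would fit with the search exploiting deeper.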
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
