Re: [Computer-go] UCB-1 tuned policy

David Fotland Wed, 15 Apr 2015 22:38:13 -0700

I didn’t notice a difference.  Like everyone else, once I had RAVE implemented 
and added biases to the tree move selection, I found the UCT term made the 
program weaker, so I removed it.


David
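
For readers unfamiliar with the setup described above, here is a minimal sketch of tree move selection that blends the Monte Carlo mean with the RAVE mean, adds a bias, and has no UCT exploration term. It is not David's actual formula: the name `rave_value`, the additive `prior` bias, the constant `k`, and the 0.5 default for unvisited moves are all assumptions for illustration. The weighting schedule beta = sqrt(k / (3n + k)) is the hand-selected one from Gelly and Silver's RAVE work.

```python
import math

def rave_value(wins, visits, rave_wins, rave_visits, prior=0.0, k=1000.0):
    """Selection value with no exploration term (illustrative sketch only).

    wins/visits           -- Monte Carlo results for this move (assumed 0/1 rewards)
    rave_wins/rave_visits -- all-moves-as-first (RAVE) results for this move
    prior                 -- hypothetical additive bias, e.g. from pattern knowledge
    k                     -- hypothetical equivalence parameter: larger k trusts RAVE longer
    """
    # Weight on the RAVE estimate: 1 when the move is unvisited, falling toward 0.
    beta = math.sqrt(k / (3.0 * visits + k))
    # Assumption: treat moves with no data as a 50% win rate.
    mc_mean = wins / visits if visits > 0 else 0.5
    rave_mean = rave_wins / rave_visits if rave_visits > 0 else 0.5
    # Note what is absent: no sqrt(log(parent_visits) / visits) bonus.
    return (1.0 - beta) * mc_mean + beta * rave_mean + prior
```

With this shape, the only push toward less-visited moves comes from the RAVE estimate and the bias, which is one way to read "I removed the UCT term".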

> -----Original Message-----
> From: Computer-go [mailto:[email protected]] On Behalf Of
> Igor Polyakov
> Sent: Tuesday, April 14, 2015 3:37 AM
> To: [email protected]
> Subject: [Computer-go] UCB-1 tuned policy
> 
> I implemented UCB1-tuned in my basic UCB-1 go player, but it doesn't seem
> like it makes a difference in self-play.
> 
> It seems to be able to run 5-25% more simulations, which probably means it
> is exploiting deeper (and has fewer steps until it runs out of room to
> play legal moves), but I have yet to see any strength improvement on
> 9x9 boards.
> 
> As far as I understand, the only thing that's different is the formula.
> Has anyone actually seen any difference between the two algorithms?
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://computer-go.org/mailman/listinfo/computer-go
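
For reference, the formula difference asked about in the quoted message can be sketched as follows. This is a minimal sketch of the two selection values as defined by Auer, Cesa-Bianchi and Fischer, not code from either poster's program. The function names and the assumption of 0/1 (loss/win) playout rewards are mine; with 0/1 rewards the sample variance reduces to mean * (1 - mean).

```python
import math

def ucb1(wins, visits, parent_visits):
    """UCB1 value: mean + sqrt(2 ln n / n_j)."""
    if visits == 0:
        return math.inf  # try every move once before comparing
    mean = wins / visits
    return mean + math.sqrt(2.0 * math.log(parent_visits) / visits)

def ucb1_tuned(wins, visits, parent_visits):
    """UCB1-tuned value: mean + sqrt((ln n / n_j) * min(1/4, V_j))."""
    if visits == 0:
        return math.inf
    mean = wins / visits
    log_term = math.log(parent_visits) / visits
    # Upper bound on the variance; assumes rewards are exactly 0 or 1.
    variance_bound = mean * (1.0 - mean) + math.sqrt(2.0 * log_term)
    return mean + math.sqrt(log_term * min(0.25, variance_bound))
```

Because min(1/4, V_j) never exceeds 1/4, the UCB1-tuned bonus is at most sqrt(1/8), about 35%, of the UCB1 bonus for the same counts. That means less exploration and deeper exploitation, which fits the behaviour described in the quoted message.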
