> How do you think what the program should do if the game is over in the Tree
> Policy, not in the Default Policy? Do we have to make the program not to
> select this node any more (not to call procedure PlaySimulation for this
> node)?

If the constants in your UCT formula are well tuned, that node should be
dropped almost immediately anyway, even if you do not do so explicitly.

But in games in which immediate losses exist, you will gain a lot by
forcing winning moves and, when no such move exists, by forcing moves
that prevent the opponent from playing a winning move:
http://hal.inria.fr/inria-00495078/en/
(e.g. for atari-go = ponnuki-go, havannah, hex, breakthrough...)
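The move-forcing rule just described might look roughly like the sketch below. It is only an illustration: tic-tac-toe stands in for the real game, and every name in it (`legal_moves`, `play`, `wins`, `candidate_moves`) is hypothetical, not taken from any existing program.

```python
# Hypothetical toy game (tic-tac-toe) used only to illustrate the filter.
# A board is a 9-character string of "X", "O" and "." (empty).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]

def play(board, move, player):
    return board[:move] + player + board[move + 1:]

def wins(board, player):
    return any(all(board[i] == player for i in line) for line in LINES)

def candidate_moves(board, player):
    """Moves the tree policy is allowed to consider for `player`."""
    opponent = "O" if player == "X" else "X"
    moves = legal_moves(board)

    # 1. If some move wins at once, consider only those moves.
    winning = [m for m in moves if wins(play(board, m, player), player)]
    if winning:
        return winning

    # 2. Otherwise keep only moves after which the opponent
    #    has no immediately winning reply.
    def opponent_can_win(after):
        return any(wins(play(after, r, opponent), opponent)
                   for r in legal_moves(after))

    safe = [m for m in moves if not opponent_can_win(play(board, m, player))]

    # 3. If every move allows an immediate loss, fall back to all legal moves.
    return safe or moves
```

The one-ply lookahead in step 2 costs one extra move generation per candidate, so whether it pays off depends on how cheap the win check is in your game.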
I think you should have a UCT formula like

  score = (nb wins + K1) / (nb sims + K2) + K3 sqrt(log(...)/ ......)

(I assume below that you have a turn-based game with no stochastic part,
not necessarily go, but not backgammon...)

where K3 = 0 if the other constants are well optimized. At this point you
should not have to implement anything forbidding losing moves (you can
still do it, but the improvement should be very moderate, maybe not worth
the effort if checking is expensive...).

If you implement RAVE, then it becomes

  alpha = (nb sims / (K4 + nb sims))^K5
  score = alpha (nb wins + K1) / (nb sims + K2)
          + (1 - alpha) (ravewins + K1') / (ravesims + K2')

Yes, so many constants: K1 > 0, K2 > 0, K1' > 0, K2' > 0, K4 > 0,
K5 in [0,1].

If you have enough time to implement database knowledge, you can add
something like

  + H(pattern, situation) / h(nb sims)

for H as in the CrazyStone papers or Mango papers, and h going to infinity.

I hope I understood your question correctly and I hope this helps :-)

Best regards,
Olivier

--
=========================================================
Olivier Teytaud -- [email protected]
TAO, LRI, UMR 8623 (CNRS - Universite Paris-Sud),
bat 490 Universite Paris-Sud F-91405 Orsay Cedex France
http://0z.fr/EJm0g
(one of the 56.5 % of french who did not vote for Sarkozy in 2007)
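For reference, the score formulas above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the exploration term is left out (the K3 = 0 case), the H/h knowledge term is not included, and the function and parameter names are made up for illustration. No values for the constants are suggested; they have to be tuned per game.

```python
def plain_score(wins, sims, k1, k2):
    """(nb wins + K1) / (nb sims + K2), exploration term omitted (K3 = 0)."""
    return (wins + k1) / (sims + k2)

def rave_score(wins, sims, rave_wins, rave_sims,
               k1, k2, k1_rave, k2_rave, k4, k5):
    """Blend of the plain score and the RAVE score.

    alpha grows from 0 towards 1 as nb sims grows, so the RAVE
    statistics dominate early and the real statistics dominate later.
    Assumes k2 > 0, k2_rave > 0, k4 > 0 and 0 <= k5 <= 1.
    """
    alpha = (sims / (k4 + sims)) ** k5
    return (alpha * plain_score(wins, sims, k1, k2)
            + (1 - alpha) * plain_score(rave_wins, rave_sims, k1_rave, k2_rave))
```

Because K2 and K2' are strictly positive, both functions are well defined for a node with zero simulations, which is what lets you select among unvisited children without a special case.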
