Hi.
-------------- Recursive Simulation Data ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- As anyone hard-data on the performances of a recursive simulator bot ? (What the hell does that mean ? :) ) You would use for example an AMAF strategy to make up a simulation of how the game may go. (rather than doing a very-fast nearly 100% random one). [ My tests shows that even with as few as 10 amaf simulation per move, the win rate against the pure-random strategy is almost 100%. (It's not a question of recursive anything in those results) ] So i really wonder how would behave the engine using this simulation policy : For each move, you select one of the highest AMAF score after 10 simulations. If all simulation get the same result, (black win, or black loose) then the game is over (the winner being who ever is shown by the simulations). Obviously you can use more simulation per move too. This idea is very basic, so i would like to have data for the following cases : - Using AMAF over the "recursive simulation" - Using a well known algorithm (like UCT) with those recursive simulations. The reasons why it may not have been tried to much, is that it would consumes obviously a lot of simulations. (and thus be inefficient) But i think it's really interesting to have those data. (which i probably should be working on myself but well if someone already have done that particular study ....). Montecarlo is well-known to be inefficient anyway :) -------------- Tree decision standard ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Now that we got a "standard light policy", maybe we should get our hand dirty on a standard tree-decision algorithm, based on those "standard light policy". Mostly, this basic algorithm is known to scale well, although inefficient if not using AMAF or RAVE or whatever. Still it's something that almost everybody has done ... so it's standard. I would like also to discuss not-deterministic alternatives to UCT. As it would be more natural to get Multi-threading then. So it's about a "standard tree decision algorithm that is multi-threading friendly". I used a simple stochastic alternative to UCT with good results. + Be nmove the number of legal move from the node. + Be best_nvisit_child the number of visit to the "best" of this node. I used the following strategy : Get a number R between 0 and best_nvisit_child + nmove. If R> best_nvisit_child then explore the best move. Otherwise pick another one at random. I think it's an epsilon-greedy equivalent. I didn't try to tune it. But it's stochastic, and though i have no hard-data to give you, it gave me a fair result. I made a bot that scored 30% win against a version of gnu go lvl 0, using this tree exploration strategy and 1000 light simulations. This bot however had a lot of experimental stuff, with the goal of optimizing the per-light-simulation performances. It was very limited in scaling as the number of simulation increased. (something we easily run onto when messing up with AMAF :) ) cheers. _________________________________________________________________ Téléphonez gratuitement à tous vos proches avec Windows Live Messenger ! Téléchargez-le maintenant ! http://www.windowslive.fr/messenger/1.asp_______________________________________________ computer-go mailing list [email protected] http://www.computer-go.org/mailman/listinfo/computer-go/
