On Wed, 2008-10-22 at 11:26 +0200, Denis fidaali wrote: > Hi. > > > -------------- > Recursive Simulation Data > ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > As anyone hard-data on the performances of a recursive simulator bot ? > (What the hell does that mean ? :) ) > > You would use for example an AMAF strategy to make up a simulation of how the > game may go. > (rather than doing a very-fast nearly 100% random one).
That is one of the very first things I tried. It's a logical thing to do. And yes, even 1 or 2 playouts beats random. In my implementation you could set the number of "internal" simulations so I think I tried something like 10. Each playout was generated by 10 playout bots. In theory you could go deeper than that of course. My results however was disappointing. I had some trouble debugging this because the program wasn't designed for this and I tried to hack it up quickly but I wondered if perhaps I had not implemented it correctly, but what I found was that even with the same number of high level simulations (taking a huge slowdown of course) it played worse. I doesn't make sense to me that it should play worse but I eventually gave up on the idea. I figured that even if I could make it play better at the same time controls this method has definite limitations because it will always be subject to a level of shortsightedness unless a real tree was constructed. - Don > > [ > My tests shows that even with as few as 10 amaf simulation per move, the win > rate against > the pure-random strategy is almost 100%. (It's not a question of recursive > anything in those results) > ] > > So i really wonder how would behave the engine using this simulation policy : > For each move, you select one of the highest AMAF score after 10 > simulations. If all simulation get the same result, > (black win, or black loose) then the game is over (the winner being who ever > is shown by the simulations). > Obviously you can use more simulation per move too. > > This idea is very basic, so i would like to have data for the following > cases : > > - Using AMAF over the "recursive simulation" > - Using a well known algorithm (like UCT) with those recursive simulations. > > The reasons why it may not have been tried to much, is that it would consumes > obviously a lot of simulations. (and thus be inefficient) > But i think it's really interesting to have those data. (which i probably > should be working on myself but well if someone > already have done that particular study ....). Montecarlo is well-known to be > inefficient anyway :) > > -------------- > Tree decision standard > ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Now that we got a "standard light policy", maybe we should get our hand dirty > on a standard tree-decision algorithm, > based on those "standard light policy". Mostly, this basic algorithm is known > to scale well, although inefficient if > not using AMAF or RAVE or whatever. Still it's something that almost > everybody has done ... so it's standard. > > I would like also to discuss not-deterministic alternatives to UCT. As it > would be more natural to get Multi-threading then. > So it's about a "standard tree decision algorithm that is multi-threading > friendly". > > > I used a simple stochastic alternative to UCT with good results. > + Be nmove the number of legal move from the node. > + Be best_nvisit_child the number of visit to the "best" of this node. > > I used the following strategy : > Get a number R between 0 and best_nvisit_child + nmove. If R> > best_nvisit_child then explore the best move. Otherwise pick another one at > random. > I think it's an epsilon-greedy equivalent. I didn't try to tune it. But it's > stochastic, and though i have no hard-data to give you, it gave me a fair > result. > > I made a bot that scored 30% win against a version of gnu go lvl 0, using > this tree exploration strategy and 1000 light simulations. This bot however > had a lot of experimental stuff, > with the goal of optimizing the per-light-simulation performances. It was > very limited in scaling as the number of simulation increased. > (something we easily run onto when messing up with AMAF :) ) > > cheers. > _________________________________________________________________ > Téléphonez gratuitement à tous vos proches avec Windows Live Messenger ! > Téléchargez-le maintenant ! > http://www.windowslive.fr/messenger/1.asp_______________________________________________ > computer-go mailing list > [email protected] > http://www.computer-go.org/mailman/listinfo/computer-go/
signature.asc
Description: This is a digitally signed message part
_______________________________________________ computer-go mailing list [email protected] http://www.computer-go.org/mailman/listinfo/computer-go/
