Hi.

--------------
 Recursive Simulation Data
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
As anyone hard-data on the performances of a recursive simulator bot ?
(What the hell does that mean ? :) )

You would use for example an AMAF strategy to make up a simulation of how the 
game may go.
(rather than doing a very-fast nearly 100% random one).

[
My tests shows that even with as few as 10 amaf simulation per move, the win 
rate against
the pure-random strategy is almost 100%. (It's not a question of recursive 
anything in those results)
]

 So i really wonder how would behave the engine using this simulation policy :
 For each move, you select one of the highest AMAF score after 10 simulations. 
If all simulation get the same result,
(black win, or black loose) then the game is over (the winner being who ever is 
shown by the simulations).
Obviously you can use more simulation per move too.

 This idea is very basic, so i would like to have data for the following cases :

 - Using AMAF over the "recursive simulation"
 - Using a well known algorithm (like UCT) with those recursive simulations.

The reasons why it may not have been tried to much, is that it would consumes 
obviously a lot of simulations. (and thus be inefficient)
But i think it's really interesting to have those data. (which i probably 
should be working on myself but well if someone
already have done that particular study ....). Montecarlo is well-known to be 
inefficient anyway :)

--------------
Tree decision standard
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Now that we got a "standard light policy", maybe we should get our hand dirty 
on a standard tree-decision algorithm,
based on those "standard light policy". Mostly, this basic algorithm is known 
to scale well, although inefficient if
not using AMAF or RAVE or whatever. Still it's something that almost everybody 
has done ... so it's standard.

I would like also to discuss not-deterministic alternatives to UCT. As it would 
be more natural to get Multi-threading then.
So it's about a "standard tree decision algorithm that is multi-threading 
friendly".


I used a simple stochastic alternative to UCT with good results.
+ Be nmove the number of legal move from the node.
+ Be best_nvisit_child the number of visit to the "best" of this node.

 I used the following strategy : 
Get a number R between 0 and best_nvisit_child + nmove. If R> best_nvisit_child 
then explore the best move. Otherwise pick another one at random.
I think it's an epsilon-greedy equivalent. I didn't try to tune it. But it's 
stochastic, and though i have no hard-data to give you, it gave me a fair 
result.

I made a bot that scored 30% win against a version of gnu go lvl 0, using this 
tree exploration strategy and 1000 light simulations. This bot however had a 
lot of experimental stuff,
with the goal of optimizing the per-light-simulation performances. It was very 
limited in scaling as the number of simulation increased. 
(something we easily run onto when messing up with AMAF :) )

cheers.
_________________________________________________________________
Téléphonez gratuitement à tous vos proches avec Windows Live Messenger  !  
Téléchargez-le maintenant ! 
http://www.windowslive.fr/messenger/1.asp_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to