Re: [computer-go] Request for : Recursive Simulation Data + Tree decision standard

Don Dailey Wed, 22 Oct 2008 05:09:00 -0700

On Wed, 2008-10-22 at 11:26 +0200, Denis fidaali wrote:
> Hi.
> 
> 
> --------------
>  Recursive Simulation Data
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> As anyone hard-data on the performances of a recursive simulator bot ?
> (What the hell does that mean ? :) )
> 
> You would use for example an AMAF strategy to make up a simulation of how the 
> game may go.
> (rather than doing a very-fast nearly 100% random one).


That is one of the very first things I tried.  It's a logical thing to
do.     And yes, even 1 or 2 playouts beats random.  

In my implementation you could set the number of "internal" simulations
so I think I tried something like 10.   Each playout was generated by 10
playout bots.    In theory you could go deeper than that of course.

My results however was disappointing.  I had some trouble debugging this
because the program wasn't designed for this and I tried to hack it up
quickly but I  wondered if perhaps  I had not implemented it correctly,
but what I found was that even with the same number of high level
simulations (taking a huge slowdown of course) it played worse.  

I doesn't make sense to me that it should play worse but I eventually
gave up on the idea.   I figured that even if I could make it play
better at the same time controls this method has definite limitations
because it will always be subject to a level of shortsightedness unless
a real tree was constructed.

- Don



> 
> [
> My tests shows that even with as few as 10 amaf simulation per move, the win 
> rate against
> the pure-random strategy is almost 100%. (It's not a question of recursive 
> anything in those results)
> ]
> 
>  So i really wonder how would behave the engine using this simulation policy :
>  For each move, you select one of the highest AMAF score after 10 
> simulations. If all simulation get the same result,
> (black win, or black loose) then the game is over (the winner being who ever 
> is shown by the simulations).
> Obviously you can use more simulation per move too.
> 
>  This idea is very basic, so i would like to have data for the following 
> cases :
> 
>  - Using AMAF over the "recursive simulation"
>  - Using a well known algorithm (like UCT) with those recursive simulations.
> 
> The reasons why it may not have been tried to much, is that it would consumes 
> obviously a lot of simulations. (and thus be inefficient)
> But i think it's really interesting to have those data. (which i probably 
> should be working on myself but well if someone
> already have done that particular study ....). Montecarlo is well-known to be 
> inefficient anyway :)
> 
> --------------
> Tree decision standard
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> Now that we got a "standard light policy", maybe we should get our hand dirty 
> on a standard tree-decision algorithm,
> based on those "standard light policy". Mostly, this basic algorithm is known 
> to scale well, although inefficient if
> not using AMAF or RAVE or whatever. Still it's something that almost 
> everybody has done ... so it's standard.
> 
> I would like also to discuss not-deterministic alternatives to UCT. As it 
> would be more natural to get Multi-threading then.
> So it's about a "standard tree decision algorithm that is multi-threading 
> friendly".
> 
> 
> I used a simple stochastic alternative to UCT with good results.
> + Be nmove the number of legal move from the node.
> + Be best_nvisit_child the number of visit to the "best" of this node.
> 
>  I used the following strategy : 
> Get a number R between 0 and best_nvisit_child + nmove. If R> 
> best_nvisit_child then explore the best move. Otherwise pick another one at 
> random.
> I think it's an epsilon-greedy equivalent. I didn't try to tune it. But it's 
> stochastic, and though i have no hard-data to give you, it gave me a fair 
> result.
> 
> I made a bot that scored 30% win against a version of gnu go lvl 0, using 
> this tree exploration strategy and 1000 light simulations. This bot however 
> had a lot of experimental stuff,
> with the goal of optimizing the per-light-simulation performances. It was 
> very limited in scaling as the number of simulation increased. 
> (something we easily run onto when messing up with AMAF :) )
> 
> cheers.
> _________________________________________________________________
> Téléphonez gratuitement à tous vos proches avec Windows Live Messenger  !  
> Téléchargez-le maintenant ! 
> http://www.windowslive.fr/messenger/1.asp_______________________________________________
> computer-go mailing list
> [email protected]
> http://www.computer-go.org/mailman/listinfo/computer-go/

signature.asc
Description: This is a digitally signed message part

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Request for : Recursive Simulation Data + Tree decision standard

Reply via email to