My first Monte Carlo program, Viking, worked something like this (this was
before UCT).
It used alpha-beta, evaluating each terminal node with 500 simulations.
But I would do early cutoffs after 50 moves or so if a confidence
interval was outside the current alpha or beta.
This, together with heavy playouts, made it work well enough to compete for
some time even with the first UCT programs.
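
A minimal sketch of the leaf evaluation described above, not Viking's
actual code. Assumptions: simulate() is a hypothetical callable returning
1 for a win and 0 for a loss, the interval is a plain normal approximation,
and the "50" is read here as a minimum number of simulations before the
cutoff test is applied. The names evaluate_leaf, min_sims and z are
made up for illustration.

```python
import math


def evaluate_leaf(simulate, alpha, beta, max_sims=500, min_sims=50, z=1.96):
    """Estimate a leaf's win rate with playouts, stopping early when the
    confidence interval lies entirely outside the (alpha, beta) window.

    Returns (estimated_win_rate, simulations_used).
    """
    wins = 0
    for n in range(1, max_sims + 1):
        wins += simulate()  # hypothetical playout: 1 = win, 0 = loss
        if n >= min_sims:
            mean = wins / n
            # Half-width of a normal-approximation confidence interval.
            half = z * math.sqrt(mean * (1.0 - mean) / n)
            # Whole interval above beta or below alpha: the exact value
            # cannot change the alpha-beta decision, so stop simulating.
            if mean - half > beta or mean + half < alpha:
                return mean, n
    return wins / max_sims, max_sims
```

An alpha-beta search would call evaluate_leaf at each terminal node with
its current window. A leaf that wins every playout against a window of
(0.4, 0.6) is cut off after 50 simulations, while a leaf hovering around
0.5 uses all 500.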
Magnus Persson (Viking/Valkyria)
On 2016-01-11 17:15, Hendrik Baier wrote:
Hi Xavier,
I have considered this, but even level-2 Nested MCTS requires quite a
bit of thinking time. Let's say you can do 1000 playouts per second in
your game, and you want to do at least 500 playouts on levels 1 and 2
for each move decision. Assume your playouts have 50 moves on average.
Then we're talking about 3.5 hours per move already...
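
The estimate can be reproduced with a short calculation. This is a sketch
under one reading of the numbers above: every move of a level-2 playout is
chosen by a full level-1 search, and only the base playouts cost time. The
function name and parameters are made up for illustration.

```python
def nested_move_time_seconds(playouts_per_second, playouts_per_level,
                             moves_per_playout, level=2):
    """Rough time for one move decision of level-`level` nested search."""
    # A level-1 search is just `playouts_per_level` base playouts.
    base_playouts = playouts_per_level
    for _ in range(level - 1):
        # Each playout one level up makes `moves_per_playout` move choices,
        # and each choice costs a full search of the level below.
        base_playouts = playouts_per_level * moves_per_playout * base_playouts
    return base_playouts / playouts_per_second


seconds = nested_move_time_seconds(1000, 500, 50, level=2)
print(seconds / 3600)  # about 3.47 hours
```

With these made-up numbers, 500 x 50 x 500 = 12,500,000 base playouts at
1000 per second is 12,500 seconds, or roughly 3.5 hours.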
The numbers are of course all made up but you get the point. Imagine
spending only 10% of that on each move, and then try to play a couple
thousand games for parameter tuning... :) In one-player games this is
much less of a problem because you can do just one search from the
initial position.
If you simply keep your parameter values from lower time settings
however, or if you somehow extrapolate them to such long thinking
times, NMCTS could be an interesting approach for a very slow
two-player game tournament where only one or a few moves per day are
needed. You could possibly draw some qualitative conclusions from it,
even without quantitatively estimating the playing strength very well.
Cheers,
Hendrik
Thanks, it was nice.
I have a question: did you try to implement Nested Monte Carlo Tree Search
in a two-player game?
If I remember correctly, something like this was envisioned on this
mailing list.
_______________________________________________
Computer-go mailing list
[email protected]
http://computer-go.org/mailman/listinfo/computer-go