Quoting steve uurtamo <[EMAIL PROTECTED]>:

magnus,

I hate to ask the obvious, but how much time does each simulation
take?

If 100 simulations would be enough information to get it out of its
rut, why not just check every legal move for (say) 100 simulations before doing
anything else?
My program is doing perhaps 2000 simulations per second in the endgame.
Your suggestion would kill performance completely, since it would apply to all nodes in the tree. If there are 16 candidates moves in a node it would spend 1600 simulations just to search the root and one would get only 2-3 ply with normal thinking times, this simply almost reverts into simple MC eval without tree search.

This is how my first MC program worked before UCT, it did alpha-beta search evaluating each move with a constant number of moves unless a cut was enabled early.

Here is an example of how wonderfully selective the search is right now:

After 500 simulations it has switched to the correct move 2 which is searched 292 times wheras as move 1 was searched 184 times. With my first fix the switch would take at least 2 seconds.

The program was allowed to search for 15 seconds but stopped after 2.8 seconds because of superiority of the best move. Here is how the search tree looks like.

Root:
Move 1 is searched 6163 times WR=67%, move 2 575 times WR=48%, all 16 moves that were not pruned at move generations they were searched 2-14 times each.

1 ply 3568/1149/785/272/181/156/... and then 1-7 times
2 ply 2432/641/297/59/48/39/16/... and then 1-3 times
3 ply 1016/799/501/98/... and then 1-2 times
4 ply 766/166/25/16/... and then 1-5 times

for each ply I listed the number of simulations spent on all candidate moves that got any attention.

The principal variation is longer than this but my programs only prints those positions that have been searched above some threshold. Also the move ordering of this principal variations is very strong.

As you can see the tree which is something like 16x15x14x13x12x... has been reduced to something like 2x3x3x2 which is higly selective. And the quality of moves are still excellent. This is of course only possible in the endgame, in the opening the width of the search is reduced from about 20-25 candidate moves to 3-7 moves.

I probably still have some strong biases in AMAF in my program, and I guess there will be more positions whith bad behavior but from now on this will do it for me because as long as it does not locks into searching only one move the bias is probably mixed up and cannot become that extreme.

But any ideas on how to remove bad bias from AMAF is wellcome.

The only thing I do right now is that I only use the first 70% of the moves played in the simulations for AMAF, following the idea that moves at the end of each simulations probably is only noise anyway.

-Magnus


--
Magnus Persson
Berlin, Germany
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to