Quoting Peter Drake <[EMAIL PROTECTED]>:
On Oct 30, 2006, at 7:51 PM, Don Dailey wrote:
On Mon, 2006-10-30 at 19:34 -0800, Peter Drake wrote:
I'm running into a problem where my Monte Carlo program is very slow
to acknowledge that its favorite move has a strong counter.
Part of the problem is that the value of a move is based on how many
of the runs through that move have succeeded. If there were a lot of
them before the correct reply was discovered, the move has
considerable "inertia" and it takes time to recover.
The inertia effect is minor, once it finds the right move it tends to
quickly recover.
I'm getting a rather severe effect. Specifically, if the move in
question has led to wins for hundreds of thousands of games, it takes
a similar amount of time to get its win rate back down.
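To put rough numbers on that inertia, here is a minimal sketch. It assumes
the move's value is a plain running average (wins divided by games played),
and that every playout through the move loses once the refutation is found.
Both are simplifications, and the function name losses_to_reach is made up
for illustration; it is not taken from any of the programs discussed here.

```python
import math


def losses_to_reach(wins: int, games: int, target: float) -> int:
    """Consecutive losses needed before wins / games falls to `target` or below.

    Assumes a plain running average and that every new playout is a loss
    (a best case for recovery; real playouts are noisier).
    """
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    if games <= 0 or wins / games <= target:
        return 0
    # wins / (games + k) <= target  <=>  k >= wins / target - games
    return math.ceil(wins / target - games)


# Same 60% win rate, two different amounts of accumulated history.
few = losses_to_reach(600, 1_000, 0.5)
many = losses_to_reach(60_000, 100_000, 0.5)
print(few, many)
```

Under these assumptions the number of losses needed grows in proportion to
the number of games already played, which matches the behaviour described
above.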
Valkyria behaves like this too, but I see it as something natural. A
refutation is not always real, so the search needs to be conservative. Even
if the program has searched 100000 simulations, the move may still be the
best one to play, because it does not yet have a safe alternative that is
better. If the program plays the refuted move, it might still be hard for the
opponent to find the refutation. If the opponent does find it easily, then
the opponent is clearly stronger anyway...
I think there will always be situations where the search behaves like this,
and one should perhaps direct attention to the cause of the problem instead:
why did the program not find the refutation after 10000 simulations rather
than 100000? With better evaluation accuracy in the pseudorandom simulations
the problem goes away in most positions (this works very well with Valkyria),
or one can perhaps bias the search in the search algorithm itself, which I
have not tried yet.
I have a position where Valkyria finds the best move after about 10 minutes of
search, refuting 3 moves which it initially rates higher, although a human
would probably pick out the best move immediately. But given what Valkyria
knows about go, I think this 10-minute search is perfectly reasonable. It is,
after all, a quite stupid go program, which corrects itself by playing out
pseudorandom games at a very high speed.
-Magnus
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/