Probably true, but I am already running into RAM
limits with big_Mogo18 - had to halve the number of
instances of the autotest program, and am installing
RAM in the next few days to alleviate this problem.
There is also the time-per-game, which will
approximately double.
I'd vote for moving on to
On Feb 8, 2008 12:09 PM, David Silver [EMAIL PROTECTED] wrote:
I think it is time to share this idea with the world :-)
The idea is to estimate bias and variance to calculate the best
combination of UCT and RAVE values.
I have attached a pdf explaining the new formula.
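For readers without the pdf, here is a minimal sketch of what "estimate bias and variance to calculate the best combination" could look like. It is not a copy of the attached formula. It is the weight you get by minimising mean squared error under three stated assumptions: the UCT mean is treated as unbiased with variance 1/(4 * n_uct), the RAVE mean has variance 1/(4 * n_rave) plus a fixed bias, and the two estimates are independent. The names rave_weight, combined_value, n_uct, n_rave and rave_bias are hypothetical, and rave_bias is simply a constant supplied by the caller.

```python
def rave_weight(n_uct, n_rave, rave_bias):
    """Weight (beta) on the RAVE value that minimises mean squared error.

    Assumptions (not taken from the pdf): the UCT mean is unbiased with
    variance 1/(4*n_uct), the RAVE mean has variance 1/(4*n_rave) and a
    constant bias rave_bias, and the two estimates are independent.
    """
    if n_rave == 0:
        # No RAVE samples yet, so there is nothing to blend in.
        return 0.0
    return n_rave / (n_uct + n_rave + 4.0 * n_uct * n_rave * rave_bias ** 2)


def combined_value(q_uct, n_uct, q_rave, n_rave, rave_bias):
    """Blend the UCT and RAVE win rates using the weight above."""
    beta = rave_weight(n_uct, n_rave, rave_bias)
    return (1.0 - beta) * q_uct + beta * q_rave


if __name__ == "__main__":
    # With few UCT simulations the RAVE value dominates; with many, it fades.
    for n in (0, 10, 1000, 100000):
        print(n, round(rave_weight(n, 5 * n + 1, 0.1), 4))
```

With a nonzero rave_bias the weight goes to zero as both counts grow, so the combined value falls back to the plain UCT average in the long run.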
Thanks!
The original
On Fri, 2008-02-08 at 16:39 -0700, David Silver wrote:
2. No, the assumption itself is not correct. The true value of a node
in the tree is 0 or 1, given perfect play. So the UCT value (which
just averages the outcomes of simulations) is significantly biased.
Who can predict perfect play?
David Silver wrote:
BTW if anyone just wants the formula, and doesn't care about the
derivation - then just use equations 11-14.
Yes, I just want to use the formula.
But I don't know what the bias is...
How can I get the value of br?
By the way, I currently use this formula:
beta = 1 - log(m)
Why are m and n different? Isn't every playout used both to update the UCT
win rate and the RAVE values for the same nodes? Won't the number of UCT
simulations and the number of RAVE simulations be the same?
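One way the two counts can come apart, sketched under the assumption of the usual all-moves-as-first bookkeeping (the pdf may define m and n differently): the UCT count for a move goes up only when that move is chosen at the node, while the RAVE count goes up whenever the move is played by the same colour anywhere later in a playout through the node. The Node class, the update function and the move names below are hypothetical, and win counts are left out to keep the focus on the visit counts.

```python
from collections import defaultdict


class Node:
    """Per-move visit counters for one tree node (hypothetical layout)."""

    def __init__(self):
        # n: how often each move was actually chosen at this node.
        self.uct_visits = defaultdict(int)
        # m: how often each move appeared at or after this node in a playout.
        self.rave_visits = defaultdict(int)


def update(node, move_chosen, later_moves_same_colour):
    """Record one playout that passed through `node`."""
    node.uct_visits[move_chosen] += 1
    # All-moves-as-first: credit the chosen move and every later move by
    # the same colour, counting each distinct move once per playout.
    for move in {move_chosen, *later_moves_same_colour}:
        node.rave_visits[move] += 1


root = Node()
update(root, "D4", ["Q16", "C3"])
update(root, "Q16", ["D4", "K10"])
update(root, "D4", ["K10"])

for move in ("D4", "Q16", "C3", "K10"):
    print(move, "n =", root.uct_visits[move], "m =", root.rave_visits[move])
```

In this toy run K10 is never chosen at the root, so its UCT count stays at zero while it already has two RAVE samples. That gap is largest for rarely tried moves, which is where the RAVE value is most useful.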
David
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of David