Re: [computer-go] How to "properly" implement RAVE?

Sylvain Gelly Fri, 30 Jan 2009 00:57:42 -0800

On Fri, Jan 30, 2009 at 9:29 AM, Magnus Persson <magnus.pers...@phmp.se> wrote:


> Quoting Sylvain Gelly <sylvain.ge...@m4x.org>:
>
>  On Wed, Jan 28, 2009 at 10:01 PM, Isaac Deutsch <i...@gmx.ch> wrote:
>>
>>  And a final question: You calculate the (beta) coefficient as
>>> c = rc / (rc+c+rc*c*BIAS);
>>> which looks similar to the formula proposed by David Silver (If I recall
>>> his name correctly). However, in his formula, the last term looks like
>>> rc*c*BIAS/(q_ur*(1-q_ur))
>>> Is it correct that we could get q_ur from the current UCT-RAVE mean
>>> value,
>>> and that it is used like that?
>>>
>>
>>
>> Yes, the formula looks very similar (David proposed that formula to me at
>> the beginning of 2007). However, my implementation did not contain
>> the q_ur*(1-q_ur) factor, which I approximated by a constant, taking q=0.5
>> so the factor=0.25.
>> I did not try the other formula; maybe it works better, although I
>> would expect the two to be similar in practice.
>>
>
> Valkyria uses an even more complicated version of what David Silver
> proposed (I really did not understand it so I came up with something that
> looked plausible to me that actually estimated the bias for each candidate
> move rather than assuming it constant).
>
> When Sylvain proposed this simple version I tested that version against my
> own interpretation. On 9x9 my complicated version might have a win rate 3%
> better than the simple version for 3 data points (700 games each) near the
> maximum. The standard error according to twogtp is 1.9.
>
> On 19x19 I got results where there was no difference at all, but with much
> higher uncertainty because not many games were played.


Did you try to tune the bias constant at all, or did you just take the one I
posted? I wrote it from memory and I believe it is in the range of possibly
good values, but it is certainly not optimal; I could be off by a significant
factor in either direction.
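
For reference, here is a minimal sketch of the two weightings discussed in this
thread, written in Python for readability. The function names, the reading of
c as the real playout count and rc as the RAVE (AMAF) count, the eps guard, and
the linear blend in blended_value are assumptions made for illustration; this
is not taken from any particular program.

```python
def rave_beta(c, rc, bias=0.015):
    """Simple form from the thread: rc / (rc + c + rc*c*BIAS).

    c    -- number of real playouts through the move (assumed meaning)
    rc   -- number of RAVE/AMAF updates for the move (assumed meaning)
    bias -- the BIAS constant; 0.015 is the value posted in the thread
    """
    if rc == 0:
        return 0.0  # no AMAF data yet, so give it no weight
    return rc / (rc + c + rc * c * bias)


def rave_beta_scaled(c, rc, q_ur, bias, eps=1e-6):
    """Variant whose last term is rc*c*BIAS / (q_ur*(1-q_ur)).

    q_ur is the current combined mean value; eps is a guard added here
    (an assumption) to avoid dividing by zero when q_ur is 0 or 1.
    """
    if rc == 0:
        return 0.0
    spread = max(q_ur * (1.0 - q_ur), eps)
    return rc / (rc + c + rc * c * bias / spread)


def blended_value(q_uct, q_rave, beta):
    """Mix the two estimates: beta weights AMAF, (1 - beta) the real mean."""
    return beta * q_rave + (1.0 - beta) * q_uct
```

With q_ur fixed at 0.5 the factor is 0.25, so in this sketch the scaled form
with constant b gives the same weight as the simple form with BIAS = b / 0.25.
Whether a posted constant already includes that factor of 4 has to be checked
for each program.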


> But the term is important for sure: the constant BIAS should be larger
> than 0, but you should be careful not to set it too high. For Valkyria the
> 0.015 value Sylvain posted here worked fine. But a higher value, for example
> 0.15, leads to bad performance on 9x9 and catastrophic performance on 19x19.
> Initially I thought this parameter should be something like 1, and that was
> completely wrong. I was just lucky to get it right, just by visual
> inspection of the search behavior when I played around with the parameter.


The bias can't possibly be 1 because by definition it is the difference
between the expected value of the normal tree search and the expected value
of AMAF. As those two values are in [0,1] anyway, the bias is certainly not
1, and in most cases very small.

>
>
> The reason that the bias term of the denominator should be close to zero
> initially is to allow AMAF to have a strong impact on the search, but at some
> point (which is delayed by having a small BIAS constant) there is a quick
> shift towards using the real win rate instead.
>
> -Magnus
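
To illustrate the shift Magnus describes, the sketch below solves the simple
formula for the real playout count c at which the AMAF weight drops to 0.5.
Setting rc / (rc + c + rc*c*BIAS) = 0.5 gives c = rc / (1 + rc*BIAS), which
approaches 1/BIAS as rc grows. The function names are hypothetical, and
treating rc as fixed while c varies is a simplifying assumption; in a real
search both counts grow together.

```python
def rave_beta(c, rc, bias):
    """AMAF weight from the thread: rc / (rc + c + rc*c*BIAS)."""
    if rc == 0:
        return 0.0
    return rc / (rc + c + rc * c * bias)


def crossover_playouts(rc, bias):
    """Real playout count c at which rave_beta(c, rc, bias) equals 0.5.

    Derived from rc = c + rc*c*bias, i.e. c = rc / (1 + rc*bias).
    """
    return rc / (1.0 + rc * bias)


if __name__ == "__main__":
    rc = 1000  # illustrative RAVE count, not a measured value
    for bias in (0.015, 0.15, 1.0):
        c_half = crossover_playouts(rc, bias)
        print(f"BIAS={bias}: AMAF weight falls to 0.5 near c={c_half:.2f}")
```

Under these assumptions, with rc = 1000 the crossover is at 62.5 real playouts
for BIAS = 0.015, fewer than 7 for BIAS = 0.15, and about 1 for BIAS = 1. With
a large constant AMAF is discounted almost immediately, which is consistent
with the poor results reported for the larger values.
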
>
>
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>

