Hi!
On Wed, Oct 27, 2010 at 11:42:56AM +0800, Aja wrote:
> > I do not use that either. I just use H() to initialize R or x
> >initially. (It seems to make pretty much no difference for Pachi which
> >one is initialized.)
>
> Looks like you are not using progressive bias. Are you using
> progressive widening combined with RAVE?
I'm sorry, I never really got fluent in this particular terminology.
I'm using what's described in the "UCT with Prior Knowledge" section of
the "Combining Online and Offline Knowledge in UCT" paper: a Q-value
function (producing a winrate-like number) and an equivalent experience,
eqex (the number of "virtual simulations"), for the various heuristics.
The eqex is set to 20 on 19x19 and 14 on 9x9 for most heuristics.
I suppose it could be reformulated in terms of progressive bias, as an
  eqex/n * Q
term in node evaluation, but to me it's much more natural to just say
that the node's winrate has been initialized by (Q, eqex).
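
To make the "(Q, eqex) initialization" concrete, here is a minimal sketch
in Python. It is not Pachi's actual code: the NodeStats class and its
method names are hypothetical, and only the eqex value of 20 comes from
the text above.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical per-node statistics; not taken from Pachi's source."""
    playouts: float = 0.0
    wins: float = 0.0

    def winrate(self) -> float:
        # With no data at all, fall back to an even 0.5 (an assumption).
        return self.wins / self.playouts if self.playouts > 0 else 0.5

    def add_prior(self, q: float, eqex: float) -> None:
        # Treat the heuristic as eqex virtual simulations with winrate q.
        self.playouts += eqex
        self.wins += q * eqex

    def add_result(self, won: bool) -> None:
        # One real playout.
        self.playouts += 1
        self.wins += 1.0 if won else 0.0


node = NodeStats()
node.add_prior(q=0.8, eqex=20)   # eqex = 20 as on 19x19
for _ in range(20):
    node.add_result(False)       # 20 real losses pull the estimate down
print(node.winrate())
```

After n real playouts the prior's share of the mean is eqex/(eqex + n), so
its influence fades on its own as real results accumulate.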
I'm certainly not using progressive unpruning (evaluating only the first
f(n) children during move selection). I don't think it has been used
anywhere besides Mango and CrazyStone?
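
For reference, the "only the first f(n) children" idea can be sketched as
follows. This is only an illustration of the term: the schedule in
unpruned_count, its parameters, and the "prior" key are all hypothetical
and are not taken from Mango, CrazyStone, or Pachi.

```python
import math


def unpruned_count(n, base=2, growth=1.3, first=40):
    """Hypothetical f(n): how many children are open after n visits."""
    if n < first:
        return base
    # Open one more child each time n grows by another factor of `growth`.
    return base + int(math.log(n / first) / math.log(growth))


def candidate_children(children, n, f=unpruned_count):
    """Return the f(n) best children, ranked by their heuristic prior."""
    ranked = sorted(children, key=lambda c: c["prior"], reverse=True)
    return ranked[:max(1, f(n))]


children = [{"move": m, "prior": p}
            for m, p in [("D4", 0.9), ("Q16", 0.7), ("K10", 0.4),
                         ("A1", 0.1), ("T19", 0.05)]]
print([c["move"] for c in candidate_children(children, n=0)])
print([c["move"] for c in candidate_children(children, n=400)])
```

Move selection would then run over the returned subset only, rather than
over all children of the node.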
> > But I do use the beta coefficient as described in the thesis - nice,
> >I could not find any publication with this formula! Using that one made
> >huge difference to me compared to the original beta formula.
>
> For computing beta, I am using David Silver's new RAVE formula that
> Sylvain posted here in the past. Maybe you got better result over
> David's formula?
I use that too (originally, I picked it up from Fuego), but isn't that
exactly what's described on that page in the thesis?
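
For readers who have not seen it, the RAVE weighting formula usually
attributed to David Silver is commonly written as
beta = n_rave / (n + n_rave + 4 * n * n_rave * b^2). The sketch below
assumes that form; the function names and the bias value b are
placeholders, not values from Pachi, Fuego, or the thesis.

```python
def rave_beta(n, n_rave, bias=0.05):
    """Weight of the RAVE estimate, assuming the form stated above.

    n      -- number of real playouts through the move
    n_rave -- number of RAVE (all-moves-as-first) samples
    bias   -- assumed RAVE bias b; a tuning placeholder
    """
    denom = n + n_rave + 4.0 * n * n_rave * bias ** 2
    return n_rave / denom if denom > 0 else 0.0


def blended_value(q, n, q_rave, n_rave, bias=0.05):
    """Mix the playout winrate q with the RAVE winrate q_rave."""
    beta = rave_beta(n, n_rave, bias)
    return (1.0 - beta) * q + beta * q_rave


print(rave_beta(0, 10))        # only RAVE data: weight is all on RAVE
print(rave_beta(1000, 1000))   # many real playouts: RAVE weight shrinks
```

With no real playouts the value is pure RAVE, and as n grows the
4 * n * n_rave * b^2 term drives beta toward zero.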
--
Petr "Pasky" Baudis
The true meaning of life is to plant a tree under whose shade
you will never sit.