RE: [SPAM] Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning

2009-10-05 Thread David Fotland
I tried this yesterday with K=10 and it seemed to make Many Faces weaker
(84.2% +- 2.3 vs 81.6% +- 1.7); not 95% confidence, but likely weaker.  This
is 19x19 vs gnugo with Many Faces using 8K playouts per move, 1000 games
without the change and 2000 games with it.  I have the UCT exploration term,
so perhaps with exploration this idea doesn't work.  Or perhaps the K I
tried is too large.
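For what it's worth, here is the quick check behind "not 95% confidence"
(a Python sketch; it assumes the +- figures above are two standard errors,
which matches the game counts):

    # Is 84.2% over 1000 games vs 81.6% over 2000 games significant?
    import math

    p1, n1 = 0.842, 1000   # without the change
    p2, n2 = 0.816, 2000   # with the K=10 change
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    print(round(z, 2))     # ~1.8: short of 1.96, so not 95% confidence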

 

David

 

From: computer-go-boun...@computer-go.org
[mailto:computer-go-boun...@computer-go.org] On Behalf Of Olivier Teytaud
Sent: Saturday, October 03, 2009 1:28 AM
To: computer-go
Subject: Re: [SPAM] Re: [SPAM] Re: [computer-go] Progressive widening vs
unpruning

 

 


4) regularized success rate (nbWins + K) / (nbSims + 2K)
(the original progressive bias is simpler than that)

 

I'm not sure what you mean here. Can you explain a bit more?

 


Sorry for being unclear, I hope I'll do better below.

Instead of just the number of wins divided by the number of simulations,
we use (number of wins + K) divided by (number of simulations + 2K);
this is similar to the even-game heuristic previously cited;
it avoids getting a 0% success rate for a move tested just once.
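In code it is just something like this (a sketch):

    # Regularized success rate: (wins + K) / (sims + 2K).
    # A move that lost its only simulation no longer scores exactly 0.
    def regularized_rate(wins, sims, K=1.0):
        return (wins + K) / (sims + 2.0 * K)

    print(regularized_rate(0, 1))         # 0.33... instead of 0.0
    print(regularized_rate(0, 1, K=0.0))  # 0.0, the problematic case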

If you apply UCT with a constant of zero in front of the sqrt(log(N)/N_i)
term, then such a regularization is necessary for showing consistency of UCT
for two-player games; and even with non-zero exploration terms, I guess
this kind of regularization avoids the program spending a very long time
without looking at a move just because of a few bad first simulations. This
kind of detail is a bit boring, but I think K>0 is much better in many
cases... well, maybe not for other implementations, depending on the other
terms you have - our formula is so long now that I'm not able to write it in
closed form :-)
By the way, K=0 is in my humble opinion a very good idea if you want to
check that UCT with a positive constant has a good effect in your code - I
feel that UCT is great if K=0, just because of the bad-first-simulation
effect - with K=0 and without an exploration term, just losing the first few
simulations can lead to the very bad situation in which a move is never
tested again.
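In pseudocode, the selection rule we are discussing is roughly the
following (a sketch only; C and K are illustrative constants, not mogo's
actual formula):

    # UCT child selection with the regularized mean.
    # With C = 0 and K = 0, a child whose only simulation was a loss
    # scores 0 with no exploration bonus, so it is never tried again;
    # either C > 0 or K > 0 avoids that starvation.
    import math

    def uct_score(wins, sims, parent_sims, C=0.5, K=1.0):
        mean = (wins + K) / (sims + 2.0 * K)   # regularized success rate
        explore = C * math.sqrt(math.log(parent_sims) / sims)
        return mean + explore

    def select_child(children, parent_sims):
        # children: list of (wins, sims) pairs, each with sims >= 1
        return max(children, key=lambda c: uct_score(c[0], c[1], parent_sims))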

Best regards,
Olivier

 


Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning

2009-10-02 Thread Olivier Teytaud

 What's your general approach?  My understanding from your previous posts
 is that it's something like:

 Your understanding is right.


By the way, all the current strong programs are really very similar...
Perhaps Fuego has something different in 19x19 (no big database of
patterns?). I'm not at all sure of this conjecture about Fuego.

Best regards,
Olivier

Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning

2009-10-02 Thread Jason House
On Oct 2, 2009, at 2:24 PM, Olivier Teytaud olivier.teyt...@lri.fr wrote:




4) regularized success rate (nbWins + K) / (nbSims + 2K)
(the original progressive bias is simpler than that)


I'm not sure what you mean here. Can you explain a bit more?


RE: [SPAM] Re: [computer-go] Progressive widening vs unpruning

2009-09-30 Thread David Fotland
Look for the graph I posted a few weeks ago.  Most things tried make it
worse.  Some make it a little better, and every now and then there is a big
jump.

David

 I'm wondering, are these tunings about squeezing out single-percent
 increases with very narrow confidence bounds, or something that gives an
 immediately noticeable 10% boost when applied? I'm curious about how the
 top bots improve, whether it's by accumulating many tiny increases or by
 long quests for sudden boosts.
 
 --
  Petr "Pasky" Baudis
 A lot of people have my books on their bookshelves.
 That's the problem, they need to read them. -- Don Knuth



Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning

2009-09-29 Thread Olivier Teytaud
 I guess I'm not really appreciating the difference between node value
 prior and progressive bias - adding a fixed small number of wins or
 diminishing heuristic value seems very similar to me in practice. Is the
 difference noticeable?


It just means that the weight of the prior does not necessarily decrease
linearly. For example, for Rave, you might prefer a weight which
depends on both the number of Rave simulations and the number of real
simulations.
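For instance, a weight of the following shape (a sketch in the style of
the published Rave formula; b is a bias parameter and the names are
illustrative):

    # Rave-weighted value: the weight beta depends on BOTH the number
    # of Rave simulations and the number of real simulations, so the
    # prior's influence does not have to decay linearly.
    def mixed_value(mc_rate, rave_rate, n_real, n_rave, b=0.05):
        if n_rave == 0:
            return mc_rate
        beta = n_rave / (n_rave + n_real + 4.0 * b * b * n_real * n_rave)
        return beta * rave_rate + (1.0 - beta) * mc_rate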

The formula designed by Sylvain for Rave, for example, can't be recovered
without this generalization. For the current mogo, as we have several terms
(one for the empirical success rate, one for Rave, one for patterns, one for
expert rules...), some of them used at different places in the code
(there is a prior expressed in terms of Rave simulations, and a prior
expressed in terms of real simulations), it would be complicated to come
back to something much simpler... but perhaps it's possible. Such a
clarification might be useful; I have no clear idea of the impact of Rave
values now in mogo, in particular for long time settings, and it's not so
easy to clarify this point - too many dependencies between the many terms
we have.

I think someone pointed out a long time ago on this mailing list that
initializing the prior in terms of Rave simulations was far less efficient
than initializing the prior in terms of real simulations, at least if you
have the classical Rave formulas - at least, we had an improvement when
adding a prior in the real simulations, but we also had an improvement when
adding one more term, which is not linear. Sorry for forgetting who :-(
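Concretely, the two variants differ only in where the virtual simulations
are injected, something like this (a sketch; the field names are
illustrative):

    # A heuristic value h in [0, 1] injected as virtual simulations.
    class Node:
        def __init__(self):
            self.wins = self.sims = 0.0
            self.rave_wins = self.rave_sims = 0.0

    def add_prior(node, h, n_virtual=20, as_real=True):
        if as_real:
            node.wins += h * n_virtual        # prior as real simulations
            node.sims += n_virtual            # (reported as more efficient)
        else:
            node.rave_wins += h * n_virtual   # prior as Rave simulations
            node.rave_sims += n_virtual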
Olivier

RE: [SPAM] Re: [computer-go] Progressive widening vs unpruning

2009-09-29 Thread David Fotland
Are you sure you are not over-tuning to some opponent, or to self-play?  I
have a very simple formula: win rate plus rave (with the simpler beta
formula), plus the simple upper bound term.  I bias the rave simulations
only.  It seems to work pretty well.  Your description sounds pretty
complex.
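In pseudocode, that combination is roughly the following (a sketch; the
constants are illustrative, not Many Faces' actual values):

    # Win rate, plus rave with the simpler beta = sqrt(k / (3n + k)),
    # plus the plain UCB upper-bound term. Priors bias the rave
    # statistics only, so rave_wins/rave_sims already include them.
    import math

    def score(wins, sims, rave_wins, rave_sims, parent_sims,
              k=1000.0, C=0.5):
        beta = math.sqrt(k / (3.0 * sims + k))
        mc = wins / sims
        rave = rave_wins / rave_sims if rave_sims > 0 else 0.5
        ucb = C * math.sqrt(math.log(parent_sims) / sims)
        return (1.0 - beta) * mc + beta * rave + ucb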

 

David

 

