My implementation is very basic (and inefficient). I use Gibbs sampling (i.e., 
Metropolis-Hastings applied one dimension at a time, which scales better to 
high dimensions), with candidates drawn uniformly over the parameter range. 
Details of the implementation are in CSPWeight.cpp. I have found it good 
enough in practice, but I will improve it: it should be easy to use the 
quadratic regression to define a candidate distribution that is much better 
than uniform. A sketch of the sampling scheme follows the links below.

http://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm
http://en.wikipedia.org/wiki/Gibbs_sampling
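
A minimal sketch of the scheme in C++ (this is not the actual CSPWeight.cpp 
code; Weight is a toy stand-in for the real posterior weight, and the 
parameter ranges are placeholders):

#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

using Point = std::vector<double>;

// Toy weight function standing in for the real posterior weight of a
// parameter vector (the real one lives in CSPWeight.cpp): a Gaussian
// bump centered at 0.3 in every dimension.
double Weight(const Point &x)
{
    double s = 0.0;
    for (double xi : x)
        s += (xi - 0.3) * (xi - 0.3);
    return std::exp(-10.0 * s);
}

// One Gibbs sweep: for each coordinate in turn, draw a uniform candidate
// over that coordinate's range and accept it with the Metropolis rule.
// The uniform candidate is symmetric, so the ratio is simply wQ / wP.
void GibbsSweep(Point &p, const Point &lo, const Point &hi, std::mt19937 &rng)
{
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    double wP = Weight(p);
    for (std::size_t d = 0; d < p.size(); ++d)
    {
        const double old = p[d];
        p[d] = lo[d] + (hi[d] - lo[d]) * unit(rng);  // uniform candidate in dimension d
        const double wQ = Weight(p);
        if (wQ >= wP || unit(rng) * wP < wQ)
            wP = wQ;     // accept: keep the new coordinate
        else
            p[d] = old;  // reject: restore the old coordinate
    }
}

int main()
{
    std::mt19937 rng(42);
    Point p = {0.5, 0.5}, lo = {0.0, 0.0}, hi = {1.0, 1.0};
    for (int sweep = 0; sweep < 1000; ++sweep)
        GibbsSweep(p, lo, hi, rng);
    std::printf("sample: (%.3f, %.3f)\n", p[0], p[1]);
}

Because the candidate distribution is symmetric, the acceptance ratio reduces 
to wQ / wP, which matches the rule you describe below.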

Also, an unrelated note about priors: it is a good idea to use a pessimistic 
prior for the mean/LCB (lower confidence bound) estimation, and a more 
optimistic prior for the regression. I did not mention this in the paper. It 
prevents the algorithm from iterating forever, until the winning rate gets 
close to 100%. It is not critical for performance, but it may help a bit.
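
As a rough illustration of the idea, assuming Beta priors on the winning rate 
(the prior parameters below are placeholders, not the values CLOP uses):

#include <cstdio>

// Posterior mean of a winning rate under a Beta(a, b) prior, after
// observing `wins` wins in `games` games.
double PosteriorMean(double a, double b, int wins, int games)
{
    return (a + wins) / (a + b + games);
}

int main()
{
    const int wins = 55, games = 100;
    // Pessimistic prior (b > a) pulls the mean/LCB estimate below the raw
    // frequency, so the stopping test is not pushed toward a near-100%
    // winning rate.
    std::printf("pessimistic mean: %.3f\n", PosteriorMean(1.0, 3.0, wins, games));
    // More optimistic prior (a > b) for the regression side.
    std::printf("optimistic mean:  %.3f\n", PosteriorMean(2.0, 1.0, wins, games));
}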

Rémi

On 15 Nov 2011, at 17:59, Brian Sheppard wrote:

> I would like to know more about the exploration methods that you tested in
> CLOP. Let's start with Metropolis-Hastings.
> 
> I understand Metropolis-Hastings as having a current point P, which has a
> weight Wp, and randomly sampling a point Q, which has weight Wq. Then your
> next point will be Q if Wq >= Wp, or if Wq < Wp then move to Q with
> probability Wq/Wp, and keep P otherwise. Do I have that right?
> 
> My question concerns the space over which Q is sampled. Is it just random
> over the whole domain? Or a radius around P?
> 
> Thanks,
> Brian

_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
