Re: [computer-go] analysis of UCT and other bandit algorithms for tree search

Jason House Wed, 25 Apr 2007 07:01:47 -0700

On 4/25/07, Remi Munos <[EMAIL PROTECTED]> wrote:


> 3. The goal is to find the optimal leaf rather than the optimal subtree.

I'm not sure to understand this comment. Actually the goal is to find a
close-to optimal play at the root level. Thus finding a close-to-optimal
min-max leaf achieves that purpose.


I think that was based off of two things...
a) The analysis of max time to "find the 1", and
b) The pure bandit problem applied directly to the leaves of the tree.

When I did sit down and read the paper for real, I saw that both of those
things were just building up the support for the BAST algorithm.  (a) is to
justify accepting a higher regret (in exchange for better worst case
performance).  (b) is showing the non-UCT extreme of the BAST algorithm
(L=infinity).  (If I understand correctly, L=0 is UCT)

PS: I was surprised that the bar chart near the end had L=infinity, L=20,
and L=7, but not L=0 plotted on it.

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] analysis of UCT and other bandit algorithms for tree search

Reply via email to