The text in the appendix has the answer, in a paragraph titled "Expand and evaluate (Fig. 2b)": "[...] The leaf node is expanded and each edge (s_t, a) is initialized to {N(s_t, a) = 0, W(s_t, a) = 0, Q(s_t, a) = 0, P(s_t, a) = p_a}; [...]"
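To make that concrete, here is a minimal Python sketch of how I read that paragraph. The names Edge, expand, select and backup and the value c_puct=1.0 are my own assumptions for illustration, not anything taken from the paper or from DeepMind's code. The point is only that Q is stored as 0 when the edge is created, so the 1/N(s,a) average is never evaluated with N(s,a) = 0.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    """Statistics stored on one edge (s, a). Hypothetical container."""
    P: float        # prior probability p_a from the network's policy output
    N: int = 0      # visit count
    W: float = 0.0  # total action-value
    Q: float = 0.0  # mean action-value, initialized to 0 at expansion


def expand(priors):
    """Create the edges of a newly expanded leaf: N = 0, W = 0, Q = 0, P = p_a."""
    return {action: Edge(P=p) for action, p in priors.items()}


def select(edges, c_puct=1.0):
    """Pick the action maximizing Q + U (c_puct=1.0 is an assumed constant)."""
    total_n = sum(e.N for e in edges.values())

    def score(e):
        u = c_puct * e.P * math.sqrt(total_n) / (1 + e.N)
        return e.Q + u

    return max(edges, key=lambda a: score(edges[a]))


def backup(edge, v):
    """Update one traversed edge with leaf value v; Q = W / N only once N >= 1."""
    edge.N += 1
    edge.W += v
    edge.Q = edge.W / edge.N
```

With this layout an unvisited edge simply competes with Q = 0 plus its prior-driven U term, and the division W / N happens only in backup, after N has been incremented.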
On Sun, Dec 3, 2017 at 11:27 AM, Andy <andy.olsen...@gmail.com> wrote:

> Figure 2a shows two bolded Q+U max values. The second one is going to a
> leaf that doesn't exist yet, i.e. not expanded yet. Where do they get that
> Q value from?
>
> The associated text doesn't clarify the situation: "Figure 2: Monte-Carlo
> tree search in AlphaGo Zero. a Each simulation traverses the tree by
> selecting the edge with maximum action-value Q, plus an upper confidence
> bound U that depends on a stored prior probability P and visit count N for
> that edge (which is incremented once traversed). b The leaf node is
> expanded..."
>
> 2017-12-03 9:44 GMT-06:00 Álvaro Begué <alvaro.be...@gmail.com>:
>
>> I am not sure where in the paper you think they use Q(s,a) for a node s
>> that hasn't been expanded yet. Q(s,a) is a property of an edge of the
>> graph. At a leaf they only use the `value' output of the neural network.
>>
>> If this doesn't match your understanding of the paper, please point to
>> the specific paragraph that you are having trouble with.
>>
>> Álvaro.
>>
>> On Sun, Dec 3, 2017 at 9:53 AM, Andy <andy.olsen...@gmail.com> wrote:
>>
>>> I don't see the AGZ paper explain what the mean action-value Q(s,a)
>>> should be for a node that hasn't been expanded yet. The equation for Q(s,a)
>>> has the term 1/N(s,a) in it because it's supposed to average over N(s,a)
>>> visits. But in this case N(s,a)=0 so that won't work.
>>>
>>> Does anyone know how this is supposed to work? Or is it another detail
>>> AGZ didn't spell out?
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go