On Nov 3, 2007 5:25 PM, Benjamin Teuber <[EMAIL PROTECTED]> wrote:

> On 11/3/07, Jason House <[EMAIL PROTECTED]> wrote:
> >
> > On Fri, 2007-11-02 at 22:28 +0100, Benjamin Teuber wrote:
> > > I don't think there's something different at different depths in the
> > > tree..
> > > To update RAVE after a simulation, for each child of a node you
> > > visited during that simulation, you update if the move leading to the
> > > child was played later (until the end of the playout).
> >
> > I start each new simulation at the root of the search tree. That could
> > make every node in the tree a child (grandchild, etc...) of a node that
> > was visited. While traversing the entire tree to update values could be
> > done it seems complex and seems like it may bias results too much.
> >
> > Do you stop at just the children of nodes that are visited and not
> > extend to grandchildren?
>
> Sure, I was just referring to direct children. So, for each node n you
> visited during this simulation and each move m later played during that
> simulation by the player moving in position n, you update the node you would
> get to from n by moving at m - if m is legal in n..
>
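For concreteness, the update rule quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anyone's actual engine code: the `Node` class, its field names, and `update_rave` are hypothetical, each later move is counted at most once per node, and only children that already exist in the tree receive statistics.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical tree node; all field names are illustrative.
    to_move: str                                   # player to move here, "B" or "W"
    legal_moves: set                               # moves legal in this position
    children: dict = field(default_factory=dict)   # move -> Node
    amaf_visits: int = 0
    amaf_wins: int = 0                             # wins for the player who moved into this node


def update_rave(path, moves, winner):
    """Update AMAF/RAVE statistics of direct children only.

    path[i] is the node from which moves[i] was played; moves is a list of
    (player, move) pairs covering both the tree descent and the playout.
    """
    for i, node in enumerate(path):
        # Moves played later in the simulation by the player to move at this node.
        later = {m for (p, m) in moves[i + 1:] if p == node.to_move}
        # Only moves that are legal in this node's position qualify.
        for m in later & node.legal_moves:
            child = node.children.get(m)
            if child is None:
                continue  # assumption: no stats kept for children not yet in the tree
            child.amaf_visits += 1
            if winner == node.to_move:
                child.amaf_wins += 1
```

Grandchildren are never touched: each visited node looks only one level down, which matches the "direct children" answer in the quoted text.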
I implemented this yesterday. In doing so, I realized I didn't know the proper way to initialize new leaves in the UCT tree. The MoGo papers seem to describe a progression from always picking an unexplored leaf (i.e. using infinity for the upper confidence bound), to "first play urgency" (using a fixed UCB value for new leaves), to using patterns.

I don't yet have patterns and am curious what is recommended. Currently, if a child has no real simulations, I give it a first play urgency of 110%. If a child has no AMAF simulations, I pick it for immediate simulation. Have any techniques (without patterns) proven more effective?
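To make the question concrete, here is a rough sketch of the leaf initialization described above. It is only an illustration: the function names, the tuple layout, and the exploration constant `c` are assumptions, and plain UCB1 on real simulations stands in for whatever formula an engine actually uses once a child has been visited (how RAVE values get blended in is not covered here).

```python
import math

FPU = 1.10  # first play urgency of 110%, used for children with no real simulations


def urgency(visits, wins, amaf_visits, parent_visits, c=1.0):
    """Selection value of one child (hypothetical helper)."""
    if amaf_visits == 0:
        return math.inf   # no AMAF simulations: pick for immediate simulation
    if visits == 0:
        return FPU        # no real simulations: fixed first play urgency
    # Assumption: plain UCB1 on real simulations; c is an illustrative constant.
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)


def select(children, parent_visits):
    """children maps move -> (visits, wins, amaf_visits); returns the chosen move."""
    return max(children, key=lambda m: urgency(*children[m], parent_visits))
```

With these numbers, a child with a 60% win rate over 10 of the parent's 10 simulations scores about 1.08, so an unvisited sibling at 1.10 is still tried first, while a child with a 90% win rate is preferred over unvisited siblings.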
