Hi Bo,

> Let me know if there is any silly mistakes :)

You say "the perfect policy network can be derived from the perfect value network (the best next move is the move that maximises the value for the player, if the value function is perfect), but not vice versa.", but a perfect policy for both players can be used to generate a perfect playout, which yields the perfect value...

regards,
-John
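
P.S. To make both directions concrete, here is a minimal sketch on a toy subtraction game (take 1 to 3 stones, whoever takes the last stone wins). The game and all function names are my own choices for illustration and have nothing to do with any actual Go program; it only assumes a deterministic, two-player, zero-sum game small enough to solve exactly.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # stones a player may take per turn (toy game, assumed)


@lru_cache(maxsize=None)
def perfect_value(n):
    """Exact value for the player to move with n stones left: +1 win, -1 loss."""
    if n == 0:
        return -1  # the previous player took the last stone, so the mover has lost
    return max(-perfect_value(n - m) for m in MOVES if m <= n)


def policy_from_value(n):
    """Value -> policy: choose the move that maximises the mover's value."""
    legal = [m for m in MOVES if m <= n]
    return max(legal, key=lambda m: -perfect_value(n - m))


def value_from_policy(n, policy):
    """Policy -> value: both players follow the policy; score the playout."""
    sign = 1  # +1 while the original mover is to play, -1 otherwise
    while n > 0:
        n -= policy(n)
        sign = -sign
    # Whoever is to move at n == 0 has lost, seen from the original mover's side.
    return -sign


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, perfect_value(n), value_from_policy(n, policy_from_value))
```

For every starting pile, the value recovered from the playout should agree with the exact value. In a real game the catch is that you need one playout per position you want to evaluate, and the policy has to be perfect for both sides.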