Hi Bo,

> Let me know if there is any silly mistakes :)

You say "the perfect policy network can be
derived from the perfect value network (the best next move is the move
that maximises the value for the player, if the value function is
perfect), but not vice versa." However, a perfect policy for both players
can be used to generate a perfect playout, which yields the perfect
value, so the derivation works in both directions.
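
To make the point concrete, here is a minimal sketch on a toy game rather than Go. Everything in it is my own illustrative assumption: single-pile Nim (take 1 to 3 stones, whoever takes the last stone wins), and the hypothetical helper names perfect_value, policy_from_value and value_from_policy. The perfect policy is obtained from the value function only for convenience; any perfect policy would do for the playout step. It also assumes a deterministic game with a deterministic policy.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # toy game: remove 1-3 stones, taking the last stone wins


@lru_cache(maxsize=None)
def perfect_value(stones):
    """Game-theoretic value for the player to move: +1 win, -1 loss."""
    if stones == 0:
        return -1  # the opponent just took the last stone
    return max(-perfect_value(stones - m) for m in MOVES if m <= stones)


def policy_from_value(stones):
    """Value -> policy: choose the move that maximises the mover's value."""
    legal = [m for m in MOVES if m <= stones]
    return max(legal, key=lambda m: -perfect_value(stones - m))


def value_from_policy(stones, policy):
    """Policy -> value: let both sides follow `policy` and read off the
    result from the point of view of the player to move at the start."""
    sign = 1  # +1 while the original mover is to move, -1 otherwise
    while stones > 0:
        stones -= policy(stones)
        sign = -sign
    # With no stones left, the player now to move has lost.
    return -sign


if __name__ == "__main__":
    for n in range(13):
        print(n, perfect_value(n), value_from_policy(n, policy_from_value))
```

One playout per position is enough here because the policy is deterministic and perfect for both sides; with a stochastic or imperfect policy the playout result would only be an estimate.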

regards,
-John
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go