Hi,

>How do you get the V(s) for those datasets? You play out the endgame
>with the Monte Carlo playouts?
>
>I think one problem with this approach is that errors in the data for
>V(s) directly correlate to errors in MC playouts. So a large benefit of
>"mixing" the two (otherwise independent) evaluations is lost.

Yes, that is a problem for the human games dataset. On the other hand, the SL part is currently the relatively easy one (it seems everyone arrives at 50-60% accuracy), and the main challenge of the RL part is generating the huge number of self-play games. In self-play games we have an accurate end-game v(s) / V(s), and v(s) / V(s) is able to use the information in self-play games more efficiently. I think this can be helpful.
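
To make the self-play point concrete, here is a rough sketch of how each position of a finished self-play game could be labelled with the final result as its value target. The function name `label_positions`, the use of plain strings as positions, and the +1/-1 encoding from the point of view of the player to move are all assumptions made for illustration, not a description of any particular program.

```python
from typing import List, Tuple


def label_positions(positions: List[str], black_won: bool) -> List[Tuple[str, float]]:
    """Pair every position of one finished game with that game's final result."""
    labeled = []
    for ply, state in enumerate(positions):
        # Assumption: Black moves on even plies (no handicap, no passes skipped).
        black_to_move = (ply % 2 == 0)
        # Target is +1 if the player to move went on to win the game, else -1.
        z = 1.0 if black_to_move == black_won else -1.0
        labeled.append((state, z))
    return labeled


# Hypothetical three-position game that Black won.
targets = label_positions(["s0", "s1", "s2"], black_won=True)
```

The label comes from the game actually being played to the end, so no separate Monte Carlo playout is needed to produce the target for a self-play position.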