On 16/11/2017 16:43, Petr Baudis wrote:
> But now, we expand the nodes literally all the time, breaking the
> stationarity possibly in drastic ways. There are no reevaluations
> that would improve your estimate.
First of all, you don't expect the network evaluations to vary drastically between parent and children, unless there are tactics you are not understanding. Secondly, the evaluations are rather noisy, so averaging still makes sense. Third, evaluating with a different rotation effectively forms an ensemble that improves the estimate.

> Therefore, can't we take the next step, and do away with MCTS? Is
> there a theoretical viewpoint from which it still makes sense as the
> best policy improvement operator?

People have posted results with that on this list, and IIRC the programs using regular alpha-beta were weaker. As for a theoretical viewpoint: the value net is an estimate of the value of some fixed amount of Monte Carlo rollouts.

> What would you say is the current state-of-the-art game tree search for
> chess? That's a very unfamiliar world for me; to be honest, all I
> really know is MCTS...

The same as it was 20 years ago: alpha-beta. Though one could certainly argue that an alpha-beta searcher using late move reductions (searching everything but the best moves less deeply) searches a tree of a very similar shape to that of a UCT searcher with a small exploration constant.

-- 
GCP

_______________________________________________
Computer-go mailing list
[email protected]
http://computer-go.org/mailman/listinfo/computer-go
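P.S. The late-move-reduction idea above can be sketched in a few lines. This is a minimal toy illustration, not code from any engine: states are integers, `moves` and `evaluate` are placeholder functions, and the moves are assumed to be ordered best-first. The point is only the tree shape: the first (presumed best) move at each node is searched to full depth, later moves to reduced depth, with a re-search if a reduced search unexpectedly raises alpha.

```python
def moves(state):
    """Toy move generator: three moves, assumed ordered best-first."""
    return [3, 2, 1]

def evaluate(state):
    """Toy leaf evaluation, from the side to move's perspective."""
    return state

def alphabeta_lmr(state, depth, alpha, beta, full_width=1, reduction=1):
    """Negamax alpha-beta with late move reductions (LMR).

    Moves after the first `full_width` moves are searched `reduction`
    plies shallower, so the tree ends up deep along the presumed best
    line and shallow elsewhere -- similar in shape to UCT with a small
    exploration constant.
    """
    if depth == 0:
        return evaluate(state)
    best = float("-inf")
    for i, mv in enumerate(moves(state)):
        child = state + mv
        # Early moves get full depth; late moves are reduced.
        d = depth - 1 if i < full_width else max(0, depth - 1 - reduction)
        score = -alphabeta_lmr(child, d, -beta, -alpha)
        # A reduced search that still beats alpha is re-searched at full depth.
        if d < depth - 1 and score > alpha:
            score = -alphabeta_lmr(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
    return best
```

In a real engine the reduction would also depend on depth, move history, and whether the move is a capture or check; the re-search on fail-high is what keeps the reductions (mostly) safe.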
