On 16/11/2017 16:43, Petr Baudis wrote:
> But now, we expand the nodes literally all the time, breaking the 
> stationarity possibly in drastic ways.  There are no reevaluations
> that would improve your estimate.

First of all, you don't expect the network evaluations to drastically
vary between parent and children, unless there are tactics that you are
not understanding.

Secondly, the evaluations are rather noisy, so averaging still makes sense.
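To make the averaging point concrete, here is a toy sketch (my own illustration, not any engine's actual code) of the standard MCTS backup: each node keeps a running mean of the values backed up through it, so many noisy value-net evaluations converge on the underlying value.

```python
import random

class Node:
    def __init__(self):
        self.visits = 0
        self.value = 0.0  # running mean of backed-up evaluations

    def backup(self, v):
        # Incremental mean update; averaging noisy evaluations
        # shrinks the variance of the estimate as visits grow.
        self.visits += 1
        self.value += (v - self.value) / self.visits

random.seed(0)
n = Node()
for _ in range(1000):
    # Simulated noisy evaluations around a "true" value of 0.6
    n.backup(0.6 + random.gauss(0.0, 0.2))
# n.value is now close to 0.6
```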

Third, evaluating with a different rotation effectively forms an
ensemble that improves the estimate.
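The ensemble effect can be sketched like this (assumptions: `net_eval` is a stand-in for the real value network, and we average over all 8 symmetries of the square rather than picking one at random per evaluation):

```python
import numpy as np

def symmetries(board):
    # The 8 elements of the dihedral group of the square:
    # four rotations, each with and without a horizontal flip.
    for k in range(4):
        r = np.rot90(board, k)
        yield r
        yield np.fliplr(r)

def net_eval(board):
    # Placeholder for the value network: deterministic, but with an
    # artificial orientation-dependent component standing in for the
    # network's sensitivity to how the position is presented.
    weights = np.arange(board.size, dtype=float).reshape(board.shape)
    return float(np.tanh((board * weights).sum() / 100.0))

def ensemble_eval(board):
    # Averaging over the whole symmetry group makes the combined
    # estimate invariant to the board's orientation.
    vals = [net_eval(b) for b in symmetries(board)]
    return sum(vals) / len(vals)
```

Averaging over the full group gives an orientation-invariant estimate; evaluating a single random rotation per visit, as some engines do, approximates the same ensemble over many visits.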

> Therefore, can't we take the next step, and do away with MCTS?  Is 
> there a theoretical viewpoint from which it still makes sense as the
> best policy improvement operator?

People have posted results with that on this list and IIRC programs
using regular alpha-beta were weaker.

As for a theoretical viewpoint: the value net is an estimate of the
outcome of some fixed number of Monte Carlo rollouts.

> What would you say is the current state-of-art game tree search for 
> chess?  That's a very unfamiliar world for me, to be honest all I
> really know is MCTS...

The same as it was 20 years ago: alpha-beta. Though one could certainly
make the argument that an alpha-beta searcher using late move reductions
(searching everything but the best moves less deeply) is searching a
tree of a very similar shape as a UCT searcher with a small exploration
constant.
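A minimal sketch of late move reductions, under illustrative assumptions (a toy game tree of nested dicts, children already ordered best-first, and a crude "reduce everything after the first move by one ply" rule; real engines use more elaborate reduction formulas and re-search on fail-high):

```python
def negamax_lmr(node, depth, alpha, beta):
    children = node.get("children")
    if depth == 0 or not children:
        return node["value"]  # static evaluation at the horizon / leaf
    best = float("-inf")
    for i, child in enumerate(children):  # assumed ordered best-first
        # LMR: search the first (presumed best) move to full depth,
        # and the "late" moves one ply shallower.
        d = depth - 1 if i == 0 else max(0, depth - 2)
        score = -negamax_lmr(child, d, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
    return best

# Toy tree; leaf values are from the side-to-move's perspective.
leaf = lambda v: {"value": v, "children": []}
tree = {"value": 0, "children": [
    {"value": 0, "children": [leaf(3), leaf(5)]},
    {"value": 0, "children": [leaf(1), leaf(2)]},
]}
```

The late, reduced branches end up shallower than the principal variation, which is what produces the same "deep main line, shallow siblings" tree shape a low-exploration UCT search grows.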

-- 
GCP
_______________________________________________
Computer-go mailing list
[email protected]
http://computer-go.org/mailman/listinfo/computer-go