The numbers look pretty impressive! So this DNN is as strong as
a full-fledged MCTS engine with non-trivial thinking time. The increased
supervision is a nice idea, but even barring that, this seems like quite
a boost over the previously published results?  Surprising that this is
just thanks to relatively simple tweaks to representations and removing
features... (Or is there anything important I missed?)

I'm not sure what the implementation difference between darkfores1 and
darkfores2 is; the paper is a bit light on detail given how huge the
winrate delta is, isn't it? ("we fine-tuned the learning rate")  Hopefully
peer review will help.

Do I understand it right that in the tree, they sort moves by their
probability estimate, keep only the top moves whose probabilities sum up
to 0.8, prune the rest and use just plain UCT with no priors afterwards?
The result with +MCTS isn't at all convincing - it just shows that
MCTS helps strength, which isn't so surprising, but the extra thinking
time spent corresponds to about 10k->150k playouts increase in Pachi,
which may not be a good trade for +27/4.5/1.2% winrate increase.
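
To make the question above concrete, here is a minimal Python sketch of the
scheme as guessed at in this mail. It is only an illustration of that reading,
not the authors' implementation: the function names, the dict-based data
layout and the exploration constant c=1.4 are all assumptions, and only the
0.8 threshold comes from the discussion above.

```python
import math


def prune_by_cumulative_probability(move_probs, mass=0.8):
    """Sort moves by DCNN probability and keep the most probable ones
    until their probabilities sum to at least `mass`; drop the rest."""
    ranked = sorted(move_probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for move, prob in ranked:
        kept.append(move)
        total += prob
        if total >= mass:
            break
    return kept


def uct_select(stats, parent_visits, c=1.4):
    """Plain UCB1 choice among the surviving moves, with no prior term.
    `stats` maps move -> (wins, visits); c is an assumed constant."""
    def score(move):
        wins, visits = stats[move]
        if visits == 0:
            return math.inf  # try every surviving move at least once
        return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)
    return max(stats, key=score)


# Hypothetical network output for one position.
probs = {"D4": 0.5, "Q16": 0.25, "C3": 0.125, "K10": 0.125}
candidates = prune_by_cumulative_probability(probs)
```

After the pruning step the network probabilities are not used again in this
sketch; the choice among the kept moves depends only on win/visit counts,
which is what "no priors" would mean here.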

On Mon, Nov 23, 2015 at 09:54:37AM +0100, Rémi Coulom wrote:
> It is darkforest, indeed:
> Title: Better Computer Go Player with Neural Network and Long-term
> Prediction
> Authors: Yuandong Tian, Yan Zhu
> Abstract:
> Competing with top human players in the ancient game of Go has been a
> long-term goal of artificial intelligence. Go's high branching factor makes
> traditional search techniques ineffective, even on leading-edge hardware,
> and Go's evaluation function could change drastically with one stone change.
> Recent works [Maddison et al. (2015); Clark & Storkey (2015)] show that
> search is not strictly necessary for machine Go players. A pure
> pattern-matching approach, based on a Deep Convolutional Neural Network
> (DCNN) that predicts the next move, can perform as well as Monte Carlo Tree
> Search (MCTS)-based open source Go engines such as Pachi [Baudis & Gailly
> (2012)] if its search budget is limited. We extend this idea in our bot
> named darkforest, which relies on a DCNN designed for long-term predictions.
> Darkforest substantially improves the win rate for pattern-matching
> approaches against MCTS-based approaches, even with looser search budgets.
> Against human players, darkforest achieves a stable 1d-2d level on KGS Go
> Server, estimated from free games against human players. This substantially
> improves the estimated rankings reported in Clark & Storkey (2015), where
> DCNN-based bots are estimated at 4k-5k level based on performance against
> other machine players. Adding MCTS to darkforest creates a much stronger
> player: with only 1000 rollouts, darkforest+MCTS beats pure darkforest 90%
> of the time; with 5000 rollouts, our best model plus MCTS beats Pachi with
> 10,000 rollouts 95.5% of the time.

                                Petr Baudis
        If you have good ideas, good data and fast computers,
        you can do almost anything. -- Geoffrey Hinton
Computer-go mailing list
