Hideki,

This is a very nice observation.

s.


On Nov 16, 2017 12:37 PM, "Hideki Kato" <hideki_ka...@ybb.ne.jp> wrote:

Hi,

I strongly believe adding rollouts would make Zero stronger.
They removed rollouts just to be able to say "no human knowledge".
#Though the number of past moves (16) was tuned by
humans :).

Hideki

Petr Baudis: <20171116154309.tfq5ix2hzwzci...@machine.or.cz>:
>  Hi,
>
>  when explaining AlphaGo Zero to a machine learning audience yesterday
>(https://docs.google.com/presentation/d/1VIueYgFciGr9pxiGmoQyUQ088Ca4ouvEFDPoWpRO4oQ/view)
>it occurred to me that using MCTS in this setup is actually such
>a kludge!
>
>  Originally, we used MCTS because with the repeated simulations,
>we would be improving the accuracy of the arm reward estimates.  MCTS
>policies assume stationary distributions, an assumption violated every time
>we expand the tree, but it's an okay tradeoff if all you feed into the
>tree are rewards in the form of just Bernoulli trials.  Moreover, you
>could argue evaluations are somewhat monotonic with increasing node
>depths as you are basically just fixing a growing prefix of the MC
>simulation.
>
>  But now, we expand the nodes literally all the time, breaking the
>stationarity possibly in drastic ways.  There are no reevaluations that
>would improve your estimate.  The input isn't binary but an estimate in
>a continuous space.  Suddenly the Multi-armed Bandit analogy loses a lot
>of ground.
>
>  Therefore, can't we take the next step, and do away with MCTS?  Is
>there a theoretical viewpoint from which it still makes sense as the best
>policy improvement operator?
>
>  What would you say is the current state-of-the-art game tree search for
>chess?  That's a very unfamiliar world for me; to be honest, all I really
>know is MCTS...
>
>--
>                                       Petr Baudis, Rossum
>       Run before you walk! Fly before you crawl! Keep moving forward!
>       If we fail, I'd rather fail really hugely.  -- Moist von Lipwig
>_______________________________________________
>Computer-go mailing list
>Computer-go@computer-go.org
>http://computer-go.org/mailman/listinfo/computer-go
--
Hideki Kato <mailto:hideki_ka...@ybb.ne.jp>
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
