Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

Richard Lorentz Wed, 06 Dec 2017 10:46:25 -0800

One chess result stood out for me, namely, just how much easier it was for AlphaZero to win with white (25 wins, 25 draws, 0 losses) rather than with black (3 wins, 47 draws, 0 losses).


Maybe we should not give up on the idea of White to play and win in chess!


On 12/06/2017 01:24 AM, Hiroshi Yamashita wrote:

Hi,
DeepMind has made the strongest Chess and Shogi programs with the AlphaGo Zero method.
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
https://urldefense.proofpoint.com/v2/url?u=https-3A__arxiv.org_pdf_1712.01815.pdf&d=DwIGaQ&c=Oo8bPJf7k7r_cPTz1JF7vEiFxvFRfQtp-j14fFwh71U&r=i0hg-cKH69CA5MsdosvezQ&m=w0qxE9GOfBVzqPOT0NBm1nsdQqJMlNu40BOCWfsO-gQ&s=dsola-9J77ArHVeuVc0ZCZKn2nJOsjfsnJzPc_MdPDo&e=
AlphaZero(Chess) outperformed Stockfish after 4 hours,
AlphaZero(Shogi) outperformed elmo after 2 hours.

Search is MCTS.
AlphaZero(Chess) searches     80,000 positions/sec.
Stockfish        searches 70,000,000 positions/sec.
AlphaZero(Shogi) searches     40,000 positions/sec.
elmo             searches 35,000,000 positions/sec.
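
To put the search speeds above in proportion, here is a small Python sketch that simply divides the quoted figures. The dictionary and the helper name speed_ratio are made up for illustration; only the four positions/sec numbers come from this message.

```python
# Positions searched per second, as quoted in the message above.
positions_per_sec = {
    "AlphaZero(Chess)": 80_000,
    "Stockfish": 70_000_000,
    "AlphaZero(Shogi)": 40_000,
    "elmo": 35_000_000,
}


def speed_ratio(fast: str, slow: str) -> float:
    """Return how many times more positions per second `fast` searches than `slow`."""
    return positions_per_sec[fast] / positions_per_sec[slow]


chess_ratio = speed_ratio("Stockfish", "AlphaZero(Chess)")
shogi_ratio = speed_ratio("elmo", "AlphaZero(Shogi)")

print(f"Stockfish / AlphaZero(Chess): {chess_ratio:.0f}x")
print(f"elmo / AlphaZero(Shogi): {shogi_ratio:.0f}x")
```

With these figures, both conventional engines search 875 times as many positions per second as the corresponding AlphaZero program.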

Thanks,
Hiroshi Yamashita

_______________________________________________
Computer-go mailing list
[email protected]
https://urldefense.proofpoint.com/v2/url?u=http-3A__computer-2Dgo.org_mailman_listinfo_computer-2Dgo&d=DwIGaQ&c=Oo8bPJf7k7r_cPTz1JF7vEiFxvFRfQtp-j14fFwh71U&r=i0hg-cKH69CA5MsdosvezQ&m=w0qxE9GOfBVzqPOT0NBm1nsdQqJMlNu40BOCWfsO-gQ&s=Dflm7ezefzMJ9xLNmNYrSQKWa7qvG9FkzlCHngo_NcY&e=
