AZ scalability looks good in that diagram, and it is certainly a good start, 
but the curve only extends to 10 sec/move. Also, if the hardware is 7x better 
for AZ than for SF, should we elongate the AZ curve by 7x? Or compress the SF 
curve by 7x? Or some combination? Or take the data at face value?
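To make the question concrete, here is a rough back-of-the-envelope sketch. On a log-time axis, a 7x hardware factor is just a shift of about log2(7) ≈ 2.8 doublings; whether you credit AZ or debit SF depends on an Elo-per-doubling figure for each engine. The per-doubling values below are placeholders I made up for illustration, not measurements from the paper:

```python
import math

# A 7x hardware advantage expressed as doublings of effective compute.
hardware_factor = 7.0
doublings = math.log2(hardware_factor)  # ~2.81 doublings

# Placeholder Elo-per-doubling gains (assumed, NOT from the paper).
elo_per_doubling_sf = 60.0
elo_per_doubling_az = 40.0

# "Elongate AZ's curve by 7x": credit AZ with the extra doublings.
az_adjust = doublings * elo_per_doubling_az
# "Compress SF's curve by 7x": debit SF the same number of doublings.
sf_adjust = doublings * elo_per_doubling_sf

print(f"{doublings:.2f} doublings of compute")
print(f"AZ credited +{az_adjust:.0f} Elo, or SF debited -{sf_adjust:.0f} Elo")
```

The point of the sketch is only that the two adjustments are not symmetric: with these (assumed) slopes, shifting one curve or the other changes the gap by different amounts.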

I just noticed that AZ has some losses when the opening was forced into 
specific variations, as in Table 2. So we know that AZ is not perfect, but 19 
losses in 1200 games is a thin basis for extrapolation. (Curious: SF was a net 
winner over AZ with White in a B40 Sicilian, the only position/color 
combination out of 24 in which SF had an edge.)
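To quantify why 19 losses in 1200 games is hard to extrapolate from, a quick Wilson score interval on the observed loss rate (the method is my choice here, not anything from the paper) shows the uncertainty spans more than a factor of two:

```python
import math

# 95% Wilson score interval for AZ's loss rate in the Table 2 matches.
losses, games = 19, 1200
p = losses / games
z = 1.96  # ~95% confidence

denom = 1 + z**2 / games
center = (p + z**2 / (2 * games)) / denom
half = (z / denom) * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2))
low, high = center - half, center + half

print(f"observed loss rate {p:.2%}, 95% CI [{low:.2%}, {high:.2%}]")
```

So the true loss rate under these forced openings could plausibly be anywhere from about 1% to about 2.5%, which says little about how AZ would fare under other conditions.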

-----Original Message-----
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of 
Rémi Coulom
Sent: Thursday, December 7, 2017 11:51 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a 
General Reinforcement Learning Algorithm

>My concern about many of these points of comparison is that they presume how 
>AZ scales. In the absence of data, I would guess that AZ gains much less from 
>hardware than SF. I am basing this guess on two known facts. First is that AZ 
>did not lose a game, so the upper bound on its strength is perfection. Second, 
>AZ is a knowledge intensive program, so it is counting on judgement to a 
>larger degree.

Doesn't Figure 2 in the paper indicate convincingly that AZ scales better than 
Stockfish?

Rémi
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
