Re: [Computer-go] AlphaZero paper difference between 2017 and 2018

Yuji Ichikawa Thu, 04 Apr 2019 18:33:21 -0700

Yamashita san,

About your question, I think that the answer is yes.


AlphaZero Symmetries seems to have saturated successfully.
That suggests that a 20-block neural network trained with symmetries has the
capacity to learn from at most 21M full games.
If you let the network learn from 21M full games without preprocessing the
inputs for symmetries, it may over-fit by breaking the symmetries, since the
training data would be too small (1/8).
So they generated more games in exchange for dropping the preprocessing.
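
For reference, the symmetry preprocessing discussed here can be sketched as
follows. This is only an illustration, not the code used in either paper: the
function names are hypothetical, and the board is assumed to be a square list
of rows. It produces the 8 rotations and reflections of one position, which is
where the factor of 8 (and the 1/8) comes from.

```python
def rotate90(board):
    """Rotate a square board (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*board[::-1])]


def flip_horizontal(board):
    """Mirror a board left to right."""
    return [row[::-1] for row in board]


def dihedral_symmetries(board):
    """Return the 8 rotations/reflections of a square board.

    Hypothetical helper for illustration only.
    """
    variants = []
    current = [list(row) for row in board]
    for _ in range(4):
        variants.append(current)                   # one of the 4 rotations
        variants.append(flip_horizontal(current))  # and its mirror image
        current = rotate90(current)
    return variants


# A 3x3 board with distinct entries, so that all 8 variants differ.
example = [[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]]
variants = dihedral_symmetries(example)
```

In real training the policy target would have to be transformed with the same
rotation or reflection as the input planes; that step is left out of this
sketch.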

I agree with you that they could not remove domain-dependent knowledge 
completely.
Thinning out the positions taken from each game to account for game symmetries 
may be important.
I have no knowledge about the generalization of symmetries. It sounds like a 
hard problem if you don't preprocess the training inputs.
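
As a rough illustration of the trade-off in my guess quoted below (8 times more
games, 1/8 of the positions per game), here is a small arithmetic sketch. The
21M games and the 20 positions per game come from this thread; the figure of
160 positions per game with symmetries is a hypothetical value (8 x 20) chosen
only to make the arithmetic concrete, not a number from either paper.

```python
SYMMETRIES = 8

# With symmetry augmentation (2017-style, according to the guess).
GAMES_WITH_SYMMETRIES = 21_000_000
POSITIONS_PER_GAME_WITH_SYMMETRIES = 160  # hypothetical: 8 x 20


def sampled_positions(games, positions_per_game):
    """Number of raw positions drawn from self-play games."""
    return games * positions_per_game


# Without augmentation: 8 times more games, 1/8 of the positions per game.
games_without = GAMES_WITH_SYMMETRIES * SYMMETRIES
positions_without = POSITIONS_PER_GAME_WITH_SYMMETRIES // SYMMETRIES

with_sym = sampled_positions(GAMES_WITH_SYMMETRIES,
                             POSITIONS_PER_GAME_WITH_SYMMETRIES)
without_sym = sampled_positions(games_without, positions_without)

print(games_without, positions_without, with_sym == without_sym)
```

Under these assumptions the number of raw positions is the same in both
settings; what changes is that the positions come from 8 times as many
different games instead of from symmetric copies.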
-
ICHIKAWA, Yuji

> On 2019/04/04 23:34, Hiroshi Yamashita <y...@bd.mbn.or.jp> wrote:
> 
> Hi Ichikawa san,
> 
> Thank you for the nice explanation. I think your guess may be right.
> And the 2018 paper might have no mistake.
> 
> I carefully checked Figure 1 in both papers.
> 
> 1. 2017 reaches AlphaGo Lee in 170,000 steps. 2018 reaches it in 80,000 steps.
> 2. 2017 and 2018 reach "AlphaGo Zero (20 block)" in a similar number of steps.
> 3. The final strength is similar.
> 
> So I had thought, "If you use 7 times as many game records, initial learning
> is faster, but the final strength is similar."
> So maybe they want to say "21 million training games is enough."
> 
> But that is wrong.
> In Go, if you use all positions from a game, does it cause overfitting? And 
> does learning then fail?
> Without symmetry augmentation, Go can use only 20 positions from a game.
> Chess and Shogi are OK. It looks domain-dependent...
> 
> Thanks,
> Hiroshi Yamashita
> 
>> The Go version in the AlphaZero 2017 paper finished training in 34 hours 
>> according to Table S3.
>> And it looks like AlphaZero Symmetries in the AlphaZero 2018 paper finished 
>> training in the same time according to Figure S1.
>> So I think that the authors adopted AlphaZero Symmetries in the 2017 paper 
>> by mistake and redid the experiment for the 2018 paper.
>> To compensate for symmetries with real self-play, they generated 8 
>> times more games and reduced the positions per game to 1/8.
>> It is just my guess^^
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go

