Re: [Computer-go] mit-develops-algorithm-to-accelerate-neural-networks-by-200x

2019-03-24 Thread Brian Lee
This doesn't actually speed up neural networks that much; it's a
technique for brute-forcing the search space of possible neural network
architectures more quickly, looking for ones that execute faster while
maintaining similar accuracy. Typical hype article.

Anyway, the effort spent looking for bizarre architectures is probably
better spent doing more iterations of zero-style self-play with the same
architecture, since it seems likely we haven't maxed out the strength of
our existing architectures.

On Sun, Mar 24, 2019 at 6:29 PM Ray Tayek  wrote:

>
> https://www.extremetech.com/computing/288152-mit-develops-algorithm-to-accelerate-neural-networks-by-200x
>
> I wonder how much this would speed up Go programs?
>
> thanks
>
> --
> Honesty is a very expensive gift. So, don't expect it from cheap people -
> Warren Buffett
> http://tayek.com/
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

[Computer-go] mit-develops-algorithm-to-accelerate-neural-networks-by-200x

2019-03-24 Thread Ray Tayek

https://www.extremetech.com/computing/288152-mit-develops-algorithm-to-accelerate-neural-networks-by-200x

I wonder how much this would speed up Go programs?

thanks

--
Honesty is a very expensive gift. So, don't expect it from cheap people - 
Warren Buffett
http://tayek.com/


Re: [Computer-go] Hyper-Parameter Sweep on AlphaZero General

2019-03-24 Thread David Wu
Thanks for sharing the link. Taking a brief look at this paper, I'm quite
confused about their methodology and their interpretation of their data.

For example, in Figure 2 (b), if I understand correctly, they plot Elo
ratings for three independent runs in which they run the entire AlphaZero
process for 50, 100, and 150 iterations, where they seem to be using the
word "iteration" to mean an entire block of playing a fixed number of games
("episodes"), training the neural net, and then testing the net to see if
it should replace the previous one.

Their plot of Elo ratings, however, shows the 50-iteration run starting much
higher and ending much lower than the 100-iteration run, which in turn
starts much higher and ends much lower than the 150-iteration run. What
stands out is that each of the three runs independently appears to have mean
0. Does this mean that for every run they only computed Elos using games
between nets within that run itself, with no games comparing nets across
separate runs? If so, this makes every Elo graph in the paper tricky to
interpret, since none of the values in any of them are directly comparable
between lines. The runs that span a wider range are likely to be better ones
(more Elo improvement within that run), but since nontransitivity effects
can sometimes dilate or contract the apparent Elo gain relative to the
"true" gain against more general opponents, without cross-run games it's
hard to be entirely confident about the comparisons.
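To make the identifiability point concrete, here is a toy sketch (my own
illustration, not from the paper) that fits Elo ratings by maximum likelihood
from games played only within a single run. The head-to-head win counts are
made up. The log-likelihood depends only on rating *differences*, so any
uniform shift (such as the mean-zero anchoring the plots appear to use) is
arbitrary, and values from separate runs are not comparable:

```python
import math

# Hypothetical head-to-head results between three nets from one run:
# results[(i, j)] = (wins of net i over net j, wins of net j over net i).
results = {(0, 1): (60, 40), (1, 2): (65, 35), (0, 2): (70, 30)}

def win_prob(r_a, r_b):
    """Elo logistic model: probability that A beats B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def log_likelihood(ratings):
    ll = 0.0
    for (i, j), (w_ij, w_ji) in results.items():
        p = win_prob(ratings[i], ratings[j])
        ll += w_ij * math.log(p) + w_ji * math.log(1.0 - p)
    return ll

# Fit the ratings by gradient ascent on the log-likelihood.
ratings = [0.0, 0.0, 0.0]
for _ in range(5000):
    grads = [0.0, 0.0, 0.0]
    for (i, j), (w_ij, w_ji) in results.items():
        p = win_prob(ratings[i], ratings[j])
        g = (w_ij - (w_ij + w_ji) * p) * math.log(10.0) / 400.0
        grads[i] += g
        grads[j] -= g
    ratings = [r + 50.0 * g for r, g in zip(ratings, grads)]

# Shifting every rating by a constant leaves the likelihood unchanged,
# so only differences within this run are identified.
shifted = [r + 123.0 for r in ratings]
assert abs(log_likelihood(ratings) - log_likelihood(shifted)) < 1e-6

# Center to mean zero, as the paper's per-run plots appear to do.
mean = sum(ratings) / len(ratings)
ratings = [r - mean for r in ratings]
```

Without cross-run games there is no shared anchor, so two runs' curves can
only be compared by their internal spread, not by their absolute values.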

They also seem to imply in the text that the bump in training loss near the
end of the 150-iteration run in Figure 2 (a) indicates that the neural net
worsened, and that more iterations may make the bot worse. This seems to me
a strange conclusion. Their own graph shows that the relative Elo strength
within that run increased almost monotonically through that whole period.
Since the AlphaZero process trains towards a moving target, it's easy for
the loss to increase simply because the data gets harder, even if the neural
net always improves: for example, the most common opening might change from
a simple one to one that leads to complex, harder-to-predict games, even
while the neural net improves its strength and accuracy in both openings the
whole time.
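A toy numeric illustration of that moving-target effect (again my own, with
made-up numbers, not from the paper): cross-entropy loss can rise purely
because the data distribution gets harder, even when the later prediction is
better calibrated than the earlier one.

```python
import math

def cross_entropy(true_p, pred_p):
    """Binary cross-entropy of predicting win probability pred_p
    when the empirical outcome frequency is true_p."""
    return -(true_p * math.log(pred_p) + (1 - true_p) * math.log(1 - pred_p))

# Early training: self-play reaches a simple opening whose outcome is
# nearly deterministic (empirical win rate 0.95); the net predicts 0.90.
early_loss = cross_entropy(0.95, 0.90)

# Later: a stronger net steers play into a sharp, balanced opening
# (empirical win rate 0.55), and predicts it perfectly (0.55). Even with
# a *perfectly calibrated* prediction, the irreducible entropy of the
# harder position keeps the loss higher.
late_loss = cross_entropy(0.55, 0.55)

assert late_loss > early_loss  # loss rose although calibration improved
```

So a rising loss curve on self-generated data is not, by itself, evidence
that the net got worse; the Elo curve is the more direct measure.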


On Sun, Mar 24, 2019 at 10:05 AM Rémi Coulom  wrote:

> Hi,
>
> Here is a paper you might be interested in:
>
> Abstract:
>
> Since AlphaGo and AlphaGo Zero have achieved groundbreaking successes in
> the game of Go, the programs have been generalized to solve other tasks.
> Subsequently, AlphaZero was developed to play Go, Chess and Shogi. In the
> literature, the algorithms are explained well. However, AlphaZero contains
> many parameters, and for none of AlphaGo, AlphaGo Zero, or AlphaZero is
> there a sufficient discussion of how to set parameter values in these
> algorithms. Therefore, in this paper, we choose 12 parameters in AlphaZero
> and evaluate how these parameters contribute to training. We focus on three
> objectives (training loss, time cost and playing strength). For each
> parameter, we train 3 models using 3 different values (minimum value,
> default value, maximum value). We use the game of 6×6 Othello, on the
> AlphaZeroGeneral open source re-implementation of AlphaZero. Overall,
> experimental results show that different values can lead to different
> training results, proving the importance of such a parameter sweep. We
> categorize these 12 parameters into time-sensitive parameters and
> time-friendly parameters. Moreover, through multi-objective analysis, this
> paper provides an insightful basis for further hyper-parameter optimization.
>
> https://arxiv.org/abs/1903.08129
>
> Rémi

[Computer-go] Accelerating Self-Play Learning in Go

2019-03-24 Thread Rémi Coulom
Hi,

I have just found out that the list is not sending emails to my free.fr
email address any more. So I subscribed with my gmail address, which I hope
should work better.

I had missed that very interesting message by David Wu (
http://computer-go.org/pipermail/computer-go/2019-March/010991.html).

I simply wish to share my confirmation that maximizing territory works
well. I have been training Crazy Stone to maximize territory since the
beginning. That's how it reached the top of CGOS after a couple of months
of training with only 2-3 (Volta) GPUs, in February 2018. It took Leela
several more months to reach a similar strength, with considerably more
computing power.

Rémi

[Computer-go] Hyper-Parameter Sweep on AlphaZero General

2019-03-24 Thread Rémi Coulom
Hi,

Here is a paper you might be interested in:

Abstract:

Since AlphaGo and AlphaGo Zero have achieved groundbreaking successes in the
game of Go, the programs have been generalized to solve other tasks.
Subsequently, AlphaZero was developed to play Go, Chess and Shogi. In the
literature, the algorithms are explained well. However, AlphaZero contains
many parameters, and for none of AlphaGo, AlphaGo Zero, or AlphaZero is
there a sufficient discussion of how to set parameter values in these
algorithms. Therefore, in this paper, we choose 12 parameters in AlphaZero
and evaluate how these parameters contribute to training. We focus on three
objectives (training loss, time cost and playing strength). For each
parameter, we train 3 models using 3 different values (minimum value,
default value, maximum value). We use the game of 6×6 Othello, on the
AlphaZeroGeneral open source re-implementation of AlphaZero. Overall,
experimental results show that different values can lead to different
training results, proving the importance of such a parameter sweep. We
categorize these 12 parameters into time-sensitive parameters and
time-friendly parameters. Moreover, through multi-objective analysis, this
paper provides an insightful basis for further hyper-parameter optimization.

https://arxiv.org/abs/1903.08129

Rémi