Re: [Computer-go] agz -- meditations

2017-10-19 Thread Petri Pitkanen
Cost reduction in ICs has reached or is reaching its limits. Intel's 5nm tech
is not really 5nm, and 5nm is not really reachable, at least not without
some seriously new physics, and even then there will be hard limits like
quantum uncertainty. This particular chip may get cheaper if it is ever
produced in volume, but it is not guaranteed to get a lot cheaper.

2017-10-20 6:24 GMT+03:00 Robert Jasiek :

> On 19.10.2017 20:13, Richard Lorentz wrote:
>
>> Silver said "algorithms matter much more than ... computing".
>> Hassabis estimated they used US$25 million of hardware.
>>
>
> Today, it seems 4 TPUs cost US$25 million. In 5 or 10 years, every computer
> might have its 4-TPU chip costing $250, if not $25. At least, I hope.
>
> --
> robert jasiek
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Robert Jasiek

So there is a superstrong neural net.

1) Where is the semantic translation of the neural net to human theory 
knowledge?


2) Where is the analysis of the neural net's errors in decision-making?

3) Where is the world-wide discussion preventing a combination of AI and 
(nano-)robots, which self-replicate or permanently ensure energy access, 
from causing extinction of mankind?


--
robert jasiek
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] agz -- meditations

2017-10-19 Thread Robert Jasiek

On 19.10.2017 20:13, Richard Lorentz wrote:

Silver said "algorithms matter much more than ... computing".
Hassabis estimated they used US$25 million of hardware.


Today, it seems 4 TPUs cost US$25 million. In 5 or 10 years, every
computer might have its 4-TPU chip costing $250, if not $25. At least, I
hope.


--
robert jasiek
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Álvaro Begué
Yes, residual networks are awesome! I learned about them at ICML 2016 (
http://kaiminghe.com/icml16tutorial/index.html). Kaiming He's exposition
was fantastically clear. I used them in my own attempts at training neural
networks for move prediction. It's fairly easy to train something with 20
layers with residual networks, even without using batch normalization. With
batch normalization apparently you can get to hundreds of layers without
problems, and the models do perform better on the test data for vision
tasks. But I didn't implement that part, and the additional computational
cost probably makes this not worth it for go.

Álvaro.




On Thu, Oct 19, 2017 at 8:51 PM, Brian Sheppard via Computer-go <
computer-go@computer-go.org> wrote:

> So I am reading that residual networks are simply better than normal
> convolutional networks. There is a detailed write-up here:
> https://blog.waya.ai/deep-residual-learning-9610bb62c355
>
> Summary: the residual network has a fixed connection that adds (with no
> scaling) the output of the previous level to the output of the current
> level. The point is that once some layer learns a concept, that concept is
> immediately available to all downstream layers, without need for learning
> how to propagate the value through a complicated network design. These
> connections also provide a fast pathway for tuning deeper layers.
>
> -Original Message-
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf
> Of Gian-Carlo Pascutto
> Sent: Wednesday, October 18, 2017 4:33 PM
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] AlphaGo Zero
>
> On 18/10/2017 19:50, cazen...@ai.univ-paris8.fr wrote:
> >
> > https://deepmind.com/blog/
> >
> > http://www.nature.com/nature/index.html
>
> Select quotes that I find interesting from a brief skim:
>
> 1) Using a residual network was more accurate, achieved lower error, and
> improved performance in AlphaGo by over 600 Elo.
>
> 2) Combining policy and value together into a single network slightly
> reduced the move prediction accuracy, but reduced the value error and
> boosted playing performance in AlphaGo by around another 600 Elo.
>
> These gains sound very high (much higher than previous experiments with
> them reported here), but are likely due to the joint training.
>
> 3) The raw neural network, without using any lookahead, achieved an Elo
> rating of 3,055. ... AlphaGo Zero achieved a rating of 5,185.
>
> The increase of 2000 Elo from tree search sounds very high, but this may
> just mean the value network is simply very good - and perhaps relatively
> better than the policy one. (They previously had problems there that
> SL > RL for the policy network guiding the tree search - but I'm not
> sure there's any relation)
>
> 4) History features Xt; Yt are necessary because Go is not fully
> observable solely from the current stones, as repetitions are forbidden.
>
> This is a weird statement. Did they need 17 planes just to check for ko?
> It seems more likely that history features are very helpful for the
> internal understanding of the network as an optimization. That sucks though
> - it's annoying for analysis and position setup.
>
> Lastly, the entire training procedure is actually not very complicated at
> all, and hopefully the training is "faster" than previous approaches -
> but many things look fast if you can throw 64 GPU workers at a problem.
>
> In this context, the graphs of the differing network architectures causing
> huge strength discrepancies are both good and bad. Making a better pick can
> cause you to get massively better results, take a bad pick and you won't
> come close.
>
> --
> GCP
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Brian Sheppard via Computer-go
So I am reading that residual networks are simply better than normal 
convolutional networks. There is a detailed write-up here: 
https://blog.waya.ai/deep-residual-learning-9610bb62c355

Summary: the residual network has a fixed connection that adds (with no 
scaling) the output of the previous level to the output of the current level. 
The point is that once some layer learns a concept, that concept is immediately 
available to all downstream layers, without need for learning how to propagate 
the value through a complicated network design. These connections also provide 
a fast pathway for tuning deeper layers.
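
As a minimal illustration of that wiring, here is my own sketch in PyTorch (not
DeepMind's code; the conv-batchnorm-ReLU ordering and the 256-channel width are
assumptions taken from the paper's description):

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    # Two 3x3 convolutions with batch norm; the block's input is added back,
    # unscaled, before the final ReLU, so features learned by earlier blocks
    # pass straight through to deeper ones.
    def __init__(self, channels=256):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return F.relu(y + x)   # the fixed, unscaled skip connection

# toy usage: one 19x19 "board" with 256 feature planes
out = ResidualBlock(256)(torch.zeros(1, 256, 19, 19))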

-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of 
Gian-Carlo Pascutto
Sent: Wednesday, October 18, 2017 4:33 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] AlphaGo Zero

On 18/10/2017 19:50, cazen...@ai.univ-paris8.fr wrote:
> 
> https://deepmind.com/blog/
> 
> http://www.nature.com/nature/index.html

Select quotes that I find interesting from a brief skim:

1) Using a residual network was more accurate, achieved lower error, and 
improved performance in AlphaGo by over 600 Elo.

2) Combining policy and value together into a single network slightly reduced 
the move prediction accuracy, but reduced the value error and boosted playing 
performance in AlphaGo by around another 600 Elo.

These gains sound very high (much higher than previous experiments with them 
reported here), but are likely due to the joint training.

3) The raw neural network, without using any lookahead, achieved an Elo rating 
of 3,055. ... AlphaGo Zero achieved a rating of 5,185.

The increase of 2000 Elo from tree search sounds very high, but this may just 
mean the value network is simply very good - and perhaps relatively better than 
the policy one. (They previously had problems there that SL > RL for the
policy network guiding the tree search - but I'm not sure there's any
relation)

4) History features Xt; Yt are necessary because Go is not fully observable 
solely from the current stones, as repetitions are forbidden.

This is a weird statement. Did they need 17 planes just to check for ko?
It seems more likely that history features are very helpful for the internal 
understanding of the network as an optimization. That sucks though - it's 
annoying for analysis and position setup.
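
For what it's worth, repetition by itself can be checked without any history
planes, e.g. with a set of Zobrist hashes of earlier positions. A rough sketch
of that standard trick (names and details are my own, and of course this is
not DeepMind's code):

import random

random.seed(42)
# one 64-bit key per (point, colour); colour 0 = black, 1 = white
ZOBRIST = [[random.getrandbits(64) for _ in range(2)] for _ in range(19 * 19)]

def position_hash(stones):
    # stones: dict mapping point index 0..360 -> colour 0 or 1
    h = 0
    for point, colour in stones.items():
        h ^= ZOBRIST[point][colour]
    return h

seen = set()   # hashes of every earlier position in the game

def violates_superko(stones_after_move):
    return position_hash(stones_after_move) in seen

# after each move is actually played:
#   seen.add(position_hash(current_stones))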

Lastly, the entire training procedure is actually not very complicated at all, 
and hopefully the training is "faster" than previous approaches - but many
things look fast if you can throw 64 GPU workers at a problem.

In this context, the graphs of the differing network architectures causing huge 
strength discrepancies are both good and bad. Making a better pick can cause 
you to get massively better results, take a bad pick and you won't come close.

--
GCP
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] agz -- meditations

2017-10-19 Thread Cyris Sargon
Sure, both hardware and software / algorithms are needed... but which gets
you the bigger ROI?  { Just a rhetorical question, I know it is not linear
and not a simple question... but in general, I can see David Silver's (&
Richard Lorentz / Demis Hassabis' counter) point }.

May you live in sente,
Cyris

My AGA Go rank: http://agagd.usgo.org/player/13530/


Brian Cloutier  replied:

> Well, if you have both, why not use both :)
>
> On Oct 19th 2017 Richard Lorentz  wrote:
>
>> An interesting juxtaposition.
>>
>> Silver said "algorithms matter much more than ... computing".
>>
>> Hassabis estimated they used US$25 million of hardware.
>> ___
>> Computer-go mailing list
>
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread dave.de...@planet.nl
I would like to know how much handicap the Master version needs against the
Zero version. It could be anything from less than black without komi to more
than 3 stones.
Handicap differences cannot be deduced from regular Elo rating differences,
because the value of a handicap stone varies with skill (a stone is worth more
than 100 regular Elo points for higher dan players and less than 100 regular
Elo points for kyu players).
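
To make that dependence concrete, a small sketch using the standard logistic
Elo model (the points-per-stone values below are illustrative assumptions, not
measurements):

def elo_expected_score(diff):
    # standard logistic Elo model: win probability of the side that is
    # 'diff' rating points stronger
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# assumed worth of one handicap stone at different levels (illustrative only)
for level, points_per_stone in (("kyu", 60), ("nominal", 100), ("strong dan", 140)):
    gap = 3 * points_per_stone          # a three-stone handicap
    print(level, gap, round(elo_expected_score(gap), 3))

The same three stones map to very different Elo gaps and winning chances
depending on the level, which is why the Master-versus-Zero handicap cannot be
read off their rating difference alone.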
Dave de Vos
 
>-Original Message-
>From : 3-hirn-ver...@gmx.de
>Date : 19/10/2017 20:53
>To : computer-go@computer-go.org
>Subject : Re: [Computer-go] AlphaGo Zero
>
>What shall I say?
>Really impressive.
>My congratulations to the DeepMind team!
>
>> https://deepmind.com/blog/
>> http://www.nature.com/nature/index.html
>
>* Would the same approach also work for integral komi values
>(with the possibility of draws)? If so, what would the likely
>correct komi for 19x19 Go be?
>
>* Or in another way: Looking at Go on NxN board:
>For which values of N would the DeepMind be confident
>to find the correct komi value?
>
>
>* How often are there ko-fights in autoplay games of
>AlphaGo Zero?
>
>Ingo.
>
>PS(a fitting song). The opening theme of
>Djan-Go Unchained (with a march through a desert of stones):
>https://www.youtube.com/watch?v=R1hqn8kKZ_M
>___
>Computer-go mailing list
>Computer-go@computer-go.org
>http://computer-go.org/mailman/listinfo/computer-go
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] agz -- meditations

2017-10-19 Thread Brian Cloutier
Well, if you have both, why not use both :)

On Thu, Oct 19, 2017 at 11:51 AM Richard Lorentz 
wrote:

> An interesting juxtaposition.
>
> Silver said "algorithms matter much more than ... computing".
>
> Hassabis estimated they used US$25 million of hardware.
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Ingo Althöfer
What shall I say?
Really impressive.
My congratulations to the DeepMind team!

> https://deepmind.com/blog/
> http://www.nature.com/nature/index.html

* Would the same approach also work for integral komi values
(with the possibility of draws)? If so, what would the likely
correct komi for 19x19 Go be?

* Or in another way: Looking at Go on NxN board:
For which values of N would the DeepMind be confident
to find the correct komi value?


* How often are there ko-fights in autoplay games of
AlphaGo Zero?

Ingo.

PS(a fitting song). The opening theme of
Djan-Go Unchained (with a march through a desert of stones):
https://www.youtube.com/watch?v=R1hqn8kKZ_M
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

[Computer-go] agz -- meditations

2017-10-19 Thread Richard Lorentz

An interesting juxtaposition.

Silver said "algorithms matter much more than ... computing".

Hassabis estimated they used US$25 million of hardware.
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Álvaro Begué
Yes, it seems really odd that they didn't add a plane of all ones. The
"heads" have weights that depend on the location of the board, but all the
other layers can't tell the difference between a lonely stone at (1,1) and
one at (3,3).

In my own experiments (trying to predict human moves) I found that 3 inputs
worked well: signed liberties, age capped at 8, all ones. I think of the
number of liberties as a key part of the game mechanics, so I don't think
it detracts from the purity of the approach, and it's probably helpful for
learning about life and death.
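
A rough sketch of how I read those three planes (my own reconstruction; the
exact encodings, e.g. the sign convention and what "age" counts, are
assumptions):

import numpy as np

N = 19

def group_liberties(board, start):
    # board: (N, N) ints, +1 black, -1 white, 0 empty; flood-fill the group
    # containing 'start' and count its distinct liberties
    colour = board[start]
    stack, group, libs = [start], {start}, set()
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < N and 0 <= nc < N:
                if board[nr, nc] == 0:
                    libs.add((nr, nc))
                elif board[nr, nc] == colour and (nr, nc) not in group:
                    group.add((nr, nc))
                    stack.append((nr, nc))
    return group, len(libs)

def input_planes(board, age):
    # age: (N, N) ints, moves since each stone was played, 0 on empty points
    signed_libs = np.zeros((N, N), dtype=np.float32)
    done = set()
    for r in range(N):
        for c in range(N):
            if board[r, c] != 0 and (r, c) not in done:
                group, libs = group_liberties(board, (r, c))
                for p in group:
                    signed_libs[p] = board[r, c] * libs   # sign = colour
                done |= group
    return np.stack([signed_libs,
                     np.minimum(age, 8).astype(np.float32),  # age capped at 8
                     np.ones((N, N), dtype=np.float32)])     # all ones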

Álvaro.




On Thu, Oct 19, 2017 at 7:42 AM, Gian-Carlo Pascutto  wrote:

> On 18-10-17 19:50, cazen...@ai.univ-paris8.fr wrote:
> >
> > https://deepmind.com/blog/
> >
> > http://www.nature.com/nature/index.html
>
> Another interesting tidbit:
>
> The inputs don't contain a reliable board edge. The "white to move"
> plane contains it, but only when white is to move.
>
> So until AG Zero "black" learned that a go board is 19 x 19, the white
> player had a serious advantage.
>
> I think I will use 18 input layers :-)
>
> --
> GCP
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Petr Baudis
  The order of magnitude matches my parameter numbers.  (My attempt to
reproduce a simplified version of this is currently evolving at
https://github.com/pasky/michi/tree/nnet but the code is a mess right
now.)

On Thu, Oct 19, 2017 at 07:23:31AM -0400, Álvaro Begué wrote:
> This is a quick check of my understanding of the network architecture.
> Let's count the number of parameters in the model:
>  * convolutional block: (17*9+1)*256 + 2*256
> [ 17 = number of input channels
>9 = size of the 3x3 convolution window
>1 = bias (I am not sure this is needed if you are going to do batch
> normalization immediately after)
>  256 = number of output channels
>2 = mean and standard deviation of the output of the batch normalization
>  256 = number of channels in the batch normalization ]
>  * residual block: (256*9+1)*256 + 2*256 + (256*9+1)*256 + 2*256
>  * policy head: (256*1+1)*2 + 2*2 + (2*361+1)*362
>  * value head: (256*1+1)*1 + 2*1 + (1*361+1)*256 + (256+1)*1
> 
> Summing it all up, I get 22,837,864 parameters for the 20-block network and
> 46,461,544 parameters for the 40-block network.
> 
> Does this seem correct?
> 
> Álvaro.
> 
> 
> 
> On Thu, Oct 19, 2017 at 6:17 AM, Petr Baudis  wrote:
> 
> > On Wed, Oct 18, 2017 at 04:29:47PM -0700, David Doshay wrote:
> > > I saw my first AlphaGo Zero joke today:
> > >
> > > After a few more months of self-play the games might look like this:
> > >
> > > AlphaGo Zero Black - move 1
> > > AlphaGo Zero White - resigns
> >
> > ...which is exactly what my quick attempt to reproduce AlphaGo Zero
> > yesterday converged to overnight. ;-)  But I'm afraid it's because of
> > a bug, not wisdom...
> >
> > Petr Baudis
> > ___
> > Computer-go mailing list
> > Computer-go@computer-go.org
> > http://computer-go.org/mailman/listinfo/computer-go
> >

> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go


-- 
Petr Baudis, Rossum
Run before you walk! Fly before you crawl! Keep moving forward!
If we fail, I'd rather fail really hugely.  -- Moist von Lipwig
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Gian-Carlo Pascutto
On 18-10-17 19:50, cazen...@ai.univ-paris8.fr wrote:
> 
> https://deepmind.com/blog/
> 
> http://www.nature.com/nature/index.html

Another interesting tidbit:

The inputs don't contain a reliable board edge. The "white to move"
plane contains it, but only when white is to move.

So until AG Zero "black" learned that a go board is 19 x 19, the white
player had a serious advantage.

I think I will use 18 input layers :-)
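
A sketch of what I take "18 input layers" to mean, i.e. the 17 AlphaGo Zero
planes plus an extra all-ones plane so the board edge is always visible (the
plane ordering here is my assumption, not the paper's):

import numpy as np

def make_input(own_hist, opp_hist, black_to_move):
    # own_hist, opp_hist: 8 binary (19, 19) planes each of stone history
    colour = np.full((19, 19), 1.0 if black_to_move else 0.0, dtype=np.float32)
    ones = np.ones((19, 19), dtype=np.float32)   # edge visible for both colours
    return np.stack(list(own_hist) + list(opp_hist) + [colour, ones])

# toy usage: empty history -> an (18, 19, 19) input tensor
planes = make_input([np.zeros((19, 19), np.float32)] * 8,
                    [np.zeros((19, 19), np.float32)] * 8, True)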

-- 
GCP
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Álvaro Begué
This is a quick check of my understanding of the network architecture.
Let's count the number of parameters in the model:
 * convolutional block: (17*9+1)*256 + 2*256
[ 17 = number of input channels
   9 = size of the 3x3 convolution window
   1 = bias (I am not sure this is needed if you are going to do batch
normalization immediately after)
 256 = number of output channels
   2 = mean and standard deviation of the output of the batch normalization
 256 = number of channels in the batch normalization ]
 * residual block: (256*9+1)*256 + 2*256 + (256*9+1)*256 + 2*256
 * policy head: (256*1+1)*2 + 2*2 + (2*361+1)*362
 * value head: (256*1+1)*1 + 2*1 + (1*361+1)*256 + (256+1)*1

Summing it all up, I get 22,837,864 parameters for the 20-block network and
46,461,544 parameters for the 40-block network.
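
A quick script tallying those per-block counts (assuming the 20-block and
40-block networks mean one convolutional block plus 19 and 39 residual blocks
respectively):

conv_block  = (17*9 + 1)*256 + 2*256
residual    = 2*((256*9 + 1)*256 + 2*256)
policy_head = (256*1 + 1)*2 + 2*2 + (2*361 + 1)*362
value_head  = (256*1 + 1)*1 + 2*1 + (1*361 + 1)*256 + (256 + 1)*1

for blocks in (20, 40):
    total = conv_block + (blocks - 1)*residual + policy_head + value_head
    print(blocks, total)   # 20 -> 22837864, 40 -> 46461544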

Does this seem correct?

Álvaro.



On Thu, Oct 19, 2017 at 6:17 AM, Petr Baudis  wrote:

> On Wed, Oct 18, 2017 at 04:29:47PM -0700, David Doshay wrote:
> > I saw my first AlphaGo Zero joke today:
> >
> > After a few more months of self-play the games might look like this:
> >
> > AlphaGo Zero Black - move 1
> > AlphaGo Zero White - resigns
>
> ...which is exactly what my quick attempt to reproduce AlphaGo Zero
> yesterday converged to overnight. ;-)  But I'm afraid it's because of
> a bug, not wisdom...
>
> Petr Baudis
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Aja Huang via Computer-go
On Thu, Oct 19, 2017 at 11:04 AM, Hiroshi Yamashita 
wrote:

> I have two questions.
>
> 2017 Jan, Master , defeat 60 pros in a row.
> 2017 May, Master?, defeat Ke Jie 3-0.
>
> Master is Zero method with rollout.
> Zero   is Zero method without rollout.
>
> Did AlphaGo that played with Ke Jie use rollout?
> Is Zero with rollout stronger than Zero without rollout?
>

Hi Hiroshi,

I think these are good questions. You can ask them at
https://www.reddit.com/r/MachineLearning/comments/76xjb5/ama_we_are_david_silver_and_julian_schrittwieser/

Aja


> Thanks,
> Hiroshi Yamashita
>
> - Original Message - From: 
> To: 
> Sent: Thursday, October 19, 2017 2:50 AM
> Subject: [Computer-go] AlphaGo Zero
>
>
>
>> https://deepmind.com/blog/
>>
>> http://www.nature.com/nature/index.html
>>
>> Impressive!
>>
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Petr Baudis
On Wed, Oct 18, 2017 at 04:29:47PM -0700, David Doshay wrote:
> I saw my first AlphaGo Zero joke today:
> 
> After a few more months of self-play the games might look like this:
> 
> AlphaGo Zero Black - move 1
> AlphaGo Zero White - resigns

...which is exactly what my quick attempt to reproduce AlphaGo Zero
yesterday converged to overnight. ;-)  But I'm afraid it's because of
a bug, not wisdom...

Petr Baudis
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Hiroshi Yamashita

I have two questions.

2017 Jan, Master , defeat 60 pros in a row.
2017 May, Master?, defeat Ke Jie 3-0.

Master is Zero method with rollout.
Zero   is Zero method without rollout.

Did AlphaGo that played with Ke Jie use rollout?
Is Zero with rollout stronger than Zero without rollout?

Thanks,
Hiroshi Yamashita

- Original Message - 
From: 

To: 
Sent: Thursday, October 19, 2017 2:50 AM
Subject: [Computer-go] AlphaGo Zero



https://deepmind.com/blog/

http://www.nature.com/nature/index.html

Impressive!


___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go