Re: [Computer-go] Converging to 57%

2016-08-24 Thread Robert Waite
@Detlef It is comforting to hear that GoGoD data seemed to converge towards
51% in your testing. When I ran KGS data... it definitely converged more
quickly, but I stopped those runs short. I think it all makes sense if figure 5 of
the DarkForest paper is the convergence on KGS data... the paper isn't entirely
clear... but looking at it now... they are comparing with Maddison et al., so it
makes sense they would show the numbers for the same dataset.

@GCP the three-move strength graphs looked shaky to me... it doesn't seem
like a clear change in strength. For the ladder issue... I think MCTS and a
value or fast-rollout network are how AG overcame weaknesses like that. The
fast rollout network is actually the vaguest part to me... I have read some
of the ancestor papers... and can see that people in the field mostly know
what they are describing... but I don't know where to begin to get the
pattern counts listed in the AG tables at the end of the paper.

@David Have you matched your network vs. GnuGo? I think accuracy and loss
are indicators of model health... but playing strength seems to be a different
thing. The AG paper only mentions beating Pachi at 100k rollouts with the RL
network... not the SL one... at an 85% winrate. The DarkForest paper shows more
data with winrates... the KGS network vs. Pachi 10k won ~23% of games... but the
GoGoD-trained one won ~59%. They also tacked on extended features and 3-step
prediction... so who knows.

I am actually feeling a million times better about 51% being the heavy zone
for GoGoD data. It makes my graphs make more sense.

Graphs now:

https://drive.google.com/file/d/0B0BbrXeL6VyCZEJuMG5nVG9NYkU/view

https://drive.google.com/file/d/0B0BbrXeL6VyCR3ZxaUVGNU5pVDQ/view

Gonna keep going with the magenta and black line... I figure I can get to 48
percent. I can run 10 million pairs in a day... so the graph width is 1
week. Lol... I feel so much better if 57 isn't expected on GoGoD. 51 looks fine
and approachable on my graphs.

For the game-phase batched data... the DarkForest paper explicitly calls
out that they got stuck in poor minima without it. I figured that
randomness was fine... but you could definitely get some skews... like no
opening moves at all in a minibatch of size 16 like AG used. Their paper didn't
elaborate... but did mention 16 threads... To generate a pair... I select
one random game from all of the available SGF files... and split the game
into 16 sections. I am using threading too... so there is more to it... but
basically 16 sets of 16 makes for a 256 minibatch like the DarkForest team used.
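
A minimal sketch of that phase-stratified batching in Python... this assumes a
helper load_random_game that returns the ordered (position, next_move) pairs for
one SGF file; that helper and the pair format are placeholders of mine, not
anything from the DarkForest paper:

    import random

    SECTIONS = 16          # game phases sampled per game
    GAMES_PER_BATCH = 16   # 16 games x 16 phases = 256 pairs

    def sample_minibatch(all_sgf_files, load_random_game):
        """Build one 256-pair minibatch with every game phase represented."""
        batch = []
        for _ in range(GAMES_PER_BATCH):
            # load_random_game is assumed to return the ordered list of
            # (position, next_move) pairs for one randomly chosen game
            pairs = load_random_game(random.choice(all_sgf_files))
            n = len(pairs)
            for s in range(SECTIONS):
                lo = s * n // SECTIONS
                hi = max(lo + 1, (s + 1) * n // SECTIONS)
                batch.append(pairs[random.randrange(lo, hi)])
        random.shuffle(batch)
        return batch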

I think the only way to beat Zen or CrazyStone is to add the value network or
the fast-rollout network with MCTS. Of course... CrazyStone is evolving too... so
maybe not a goal.

Re: [Computer-go] Converging to 57%

2016-08-24 Thread David Fotland
I train using approximately the same training set as AlphaGo, but so far
without the augmentation with rotations and reflections. My target is about
55.5%, since that's what AlphaGo got on their training set without
reinforcement learning.

I find I need 5x5 in the first layer, at least 12 layers, and at least 96
filters to get over 50%. My best net is 55.3%, 18 layers by 96 filters. I use
simple SGD with a 64 minibatch, no momentum, and a 0.01 learning rate until it
flattens out, then 0.001. I have two 980 Tis, and the best nets take about 5 days
to train (about 20 epochs on about 30M positions). The last few percent is just
trial and error. Sometimes making the net wider or deeper makes it weaker.
Perhaps it's just variation from one training run to another. I haven't tried
training the same net more than once.
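
A minimal sketch of a policy net along those lines (5x5 first layer, 3x3 layers
after, 96 filters, 18 layers, plain SGD at 0.01 with a 64 minibatch), assuming
PyTorch; the number of input feature planes here is a placeholder, not David's
actual feature set:

    import torch
    import torch.nn as nn

    class PolicyNet(nn.Module):
        """Plain convolutional policy net: 5x5 first layer, then 3x3 layers."""
        def __init__(self, input_planes=8, filters=96, depth=18):
            super().__init__()
            stack = [nn.Conv2d(input_planes, filters, 5, padding=2), nn.ReLU()]
            for _ in range(depth - 2):
                stack += [nn.Conv2d(filters, filters, 3, padding=1), nn.ReLU()]
            stack += [nn.Conv2d(filters, 1, 1)]        # 1x1 down to one plane
            self.trunk = nn.Sequential(*stack)

        def forward(self, x):                          # x: (N, planes, 19, 19)
            return self.trunk(x).flatten(1)            # logits over the 361 points

    net = PolicyNet()
    opt = torch.optim.SGD(net.parameters(), lr=0.01)   # drop to 0.001 when it flattens out
    loss_fn = nn.CrossEntropyLoss()                    # target: index of the pro move, 0..360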

David

Re: [Computer-go] Converging to 57%

2016-08-23 Thread Brian Lee
I've been working on my own AlphaGo replication (code on github
https://github.com/brilee/MuGo), and I've found it reasonably easy to hit
45% prediction rate with basic features (stone locations, liberty counts,
and turns since last move), and a relatively small network (6 intermediate
layers, 32 filters in each layer), using Adam / 10e-4 learning rate. This
took ~2 hours on a GTX 960.
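
A minimal sketch of turning those basic features (stone locations, liberty
counts, turns since last move) into input planes with numpy; the plane layout
and the board/liberties/move_age array format are my assumptions, not MuGo's
actual encoding:

    import numpy as np

    def encode_position(board, liberties, move_age, to_play):
        # board: +1 black / -1 white / 0 empty; liberties: liberty count per point;
        # move_age: moves since each stone was played. All are 19x19 numpy arrays.
        planes = [
            (board == to_play).astype(np.float32),     # own stones
            (board == -to_play).astype(np.float32),    # opponent stones
            (board == 0).astype(np.float32),           # empty points
        ]
        for n in range(1, 5):                          # liberty-count planes (1, 2, 3, 4+)
            planes.append(((board != 0) & (np.minimum(liberties, 4) == n)).astype(np.float32))
        for n in range(1, 5):                          # turns-since-last-move planes
            planes.append((np.minimum(move_age, 4) == n).astype(np.float32))
        return np.stack(planes)                        # shape (11, 19, 19)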

As others have mentioned, learning shoots up sharply at the start, and
there is an extremely slow but steady improvement over time. So I'll
experiment with fleshing out more features, increasing the size of the
network, and longer training periods.

Brian

Re: [Computer-go] Converging to 57%

2016-08-23 Thread Álvaro Begué
There are situations where carefully crafting the minibatches makes sense.
For instance, if you are training an image classifier it is good to build
the minibatches so the classes are evenly represented. In the case of
predicting the next move in go I don't expect this kind of thing will make
much of a difference.

I got to around 52% on a subset of GoGoD using ideas from the ResNet paper (
https://arxiv.org/abs/1512.03385). I used 128x20 followed by 64x20 and
finally 32x20, with skip connections every two layers. I started the
training with Adam(1e-4) and later on I lowered it to 1e-5 and eventually
1e-6. The only inputs I use are the signed count of liberties (positive for
black, negative for white), the age of each stone capped at 8, and a block
of ones indicating where the board is.
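
A rough sketch of that kind of trunk in PyTorch (128x20, then 64x20, then 32x20,
a skip connection every two layers, three input planes); the stage boundaries
and the output head are my guesses rather than Álvaro's code:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResBlock(nn.Module):
        """Two 3x3 convolutions with a skip connection."""
        def __init__(self, channels):
            super().__init__()
            self.c1 = nn.Conv2d(channels, channels, 3, padding=1)
            self.c2 = nn.Conv2d(channels, channels, 3, padding=1)

        def forward(self, x):
            return F.relu(x + self.c2(F.relu(self.c1(x))))

    class ResPolicy(nn.Module):
        """128x20 -> 64x20 -> 32x20 trunk, 1x1 head over the 361 points."""
        def __init__(self, input_planes=3):
            super().__init__()
            stages, in_ch = [], input_planes
            for ch, n_layers in ((128, 20), (64, 20), (32, 20)):
                stages += [nn.Conv2d(in_ch, ch, 3, padding=1), nn.ReLU()]  # change width
                stages += [ResBlock(ch) for _ in range(n_layers // 2 - 1)]
                in_ch = ch
            self.trunk = nn.Sequential(*stages)
            self.head = nn.Conv2d(32, 1, 1)

        def forward(self, x):
            return self.head(self.trunk(x)).flatten(1)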

I'll be happy to share some code if people are interested.

Álvaro.

Re: [Computer-go] Converging to 57%

2016-08-23 Thread Gian-Carlo Pascutto
On 23/08/2016 11:26, Brian Sheppard wrote:
> The learning rate seems much too high. My experience (which is from
> backgammon rather than Go, among other caveats) is that you need tiny
> learning rates. Tiny, as in 1/TrainingSetSize.

I think that's overkill, as in you effectively end up doing batch
gradient descent instead of mini-batch/SGD.

But yes, 0.01 is rather high with momentum. Try 0.001 for methods with
momentum, and with the default Adam parameters you have to go even lower
and try 0.0001.
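
For concreteness, those two setups look roughly like this in PyTorch (a sketch;
net is any policy network, and the 0.9 momentum value is my assumption, not
something stated here):

    import torch

    sgd = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)  # momentum methods
    adam = torch.optim.Adam(net.parameters(), lr=1e-4)               # default Adam betas, lower lr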

> Neural networks are dark magic. Be prepared to spend many weeks just
> trying to figure things out. You can bet that the Google & FB results
> are just their final runs.

As always, it's sad nobody publishes what didn't work, saving us the time
of trying it all over again :-)

> Changing batching to match DarkForest style (making sure that a 
> minibatch contains samples from game phases... for example
> beginning, middle and end-game).

This sounds a bit suspicious. The entries in your minibatch should be
randomly selected from your entire training set, so statistically having
positions from all phases would be guaranteed. (Or you can shuffle the
entire training set before the epoch, instead of randomly picking during
it).

Don't feed the positions in in-order or from the same game...
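
A minimal sketch of that per-epoch shuffling in Python, where pairs is assumed
to be the full list of (position, move) training examples:

    import random

    def epoch_minibatches(pairs, batch_size=64):
        # Shuffle the whole training set once per epoch, then slice it into
        # minibatches, so every batch is a random mix of games and phases.
        order = list(range(len(pairs)))
        random.shuffle(order)
        for start in range(0, len(order) - batch_size + 1, batch_size):
            yield [pairs[i] for i in order[start:start + batch_size]]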

-- 
GCP

Re: [Computer-go] Converging to 57%

2016-08-23 Thread Brian Sheppard
The learning rate seems much too high. My experience (which is from backgammon 
rather than Go, among other caveats) is that you need tiny learning rates. 
Tiny, as in 1/TrainingSetSize.

Neural networks are dark magic. Be prepared to spend many weeks just trying to 
figure things out. You can bet that the Google & FB results are just their 
final runs.

Re: [Computer-go] Converging to 57%

2016-08-23 Thread Gian-Carlo Pascutto
On 23-08-16 08:57, Detlef Schmicker wrote:

> So, if somebody is sure it is measured against GoGoD, I think a
> number of other Go programmers have to think again. I heard them
> reaching 51% (e.g. posts by Hiroshi on this list).

I trained a 128 x 14 network for Leela 0.7.0 and this gets 51.1% on GoGoD.

Something I noticed from the papers is that the prediction percentage
keeps going upwards with more epochs, even if slowly, but still
clearly up.

In my experience my networks converge rather quickly (like >0.5% per
epoch after the first), get stuck, get one more 0.5% gain if I lower
the learning rate (by a factor 5 or 10) and don't gain any more
regardless of what I do thereafter.
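
That kind of schedule can be sketched with PyTorch's built-in plateau scheduler
(net, evaluate and validation_pairs are hypothetical placeholders; factor=0.1
matches "lower by a factor of 10"):

    import torch

    opt = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
    sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, mode="max", factor=0.1, patience=2)

    # after each epoch:
    #   val_acc = evaluate(net, validation_pairs)   # hypothetical helper
    #   sched.step(val_acc)                         # drops the lr when accuracy stalls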

I do use momentum. IIRC I tested without momentum once and it was
worse, and much slower.

I did not find any improvement in playing strength from doing
Facebook's 3 move prediction. Perhaps it needs much bigger networks
than 128 x 12.

Adding ladder features also isn't good enough to (consistently) keep
the network from playing into them. (And once it's played the first
move, you're totally SOL because the resulting positions aren't in the
training set and you'll get 99% confidence for continuing the losing
ladder moves)

I'm currently doing a more systematic comparison of all methods (and
GoGoD vs KGS+GoGoD) on 128 x 12, and testing the resulting strength
(rather than looking at prediction %). I'll post the results here, if
anything definite comes out of it.

-- 
GCP

Re: [Computer-go] Converging to 57%

2016-08-23 Thread Detlef Schmicker

Hi,

good to start this discussion here. I have had this discussion a few times,
and we (my discussion partner and I) were not sure against which test
set the 57% was measured.

If trained and tested with a KGS 6d+ dataset, it seems reasonable to
reach 57% (I reached >55% within 2 weeks, but then changed to GoGoD);
testing against GoGoD I do not get near that value either (>51%
within 3 weeks on a GTX 970). By the way, both get roughly the same
playing strength in my case.

So, if somebody is sure it is measured against GoGoD, I think a
number of other Go programmers have to think again. I heard them
reaching 51% (e.g. posts by Hiroshi on this list).
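
For what it's worth, a minimal sketch of the accuracy measurement itself,
assuming PyTorch and a loader yielding (planes, move_index) batches; which
held-out collection the loader draws from (KGS 6d+ or GoGoD) is exactly what
moves the headline number:

    import torch

    def top1_accuracy(net, loader):
        # Fraction of held-out positions where the net's argmax matches the pro move.
        net.eval()
        correct = total = 0
        with torch.no_grad():
            for planes, moves in loader:
                pred = net(planes).argmax(dim=1)
                correct += (pred == moves).sum().item()
                total += moves.numel()
        return correct / total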

Detlef


[Computer-go] Converging to 57%

2016-08-23 Thread Robert Waite
I had subscribed to this mailing list back with MoGo... and remember
probably arguing that the game of go wasn't going to be beaten for years and
years. I am a little late to the game now but was curious if anyone here
has worked with supervised learning networks like in the AlphaGo paper.

I have been training some networks along the lines of the AlphaGo paper and
the DarkForest paper... and a couple of others... and am working with a single
GTX 660. I know... laugh... but it's a fair benchmark and I'm being cheap for
the moment.

Breaking 50% accuracy is quite challenging... I have attempted many
permutations of learning algorithms... and can hit 40% accuracy in perhaps
4-12 hours... depending on the parameters set. Some things I have tried are
using the default AlphaGo setup but with 128 filters, a 32 minibatch size and a
.01 learning rate, and changing the optimizer from vanilla SGD to Adam or
RMSProp. I also changed batching to match the DarkForest style (making sure
that a minibatch contains samples from all game phases... for example
beginning, middle and end-game). Pretty much everything seems to converge at a
rate that will really stretch out. I am planning on picking a line and going
with it for an extended training run, but was wondering if anyone has ever
gotten close to the convergence rates implied by the DarkForest paper.

For comparison... the Google team had 50 GPUs, spent 3 weeks... and processed
5440M state/action pairs. The FB team had 4 GPUs, spent 2 weeks and
processed 150M-200M state/action pairs. Both seemed to get to around 57%
accuracy with their networks.

I have also been testing them against GnuGo as a baseline... and find that
GnuGo can be beaten rather easily with very little network training... my
eye is on Pachi... but I think I have to break 50% accuracy to even worry
about that.
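
A minimal sketch of driving such a baseline over GTP with subprocess (GnuGo and
Pachi both speak GTP); the coordinate conversion from the net's output is left
out, and the wrapper itself is just an illustration:

    import subprocess

    class GTPEngine:
        """Very small GTP driver for a baseline opponent."""
        def __init__(self, cmd):
            self.p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                                      stdout=subprocess.PIPE, text=True)

        def send(self, command):
            self.p.stdin.write(command + "\n")
            self.p.stdin.flush()
            reply = []
            while True:                        # GTP replies end with a blank line
                line = self.p.stdout.readline()
                if not line.strip():
                    break
                reply.append(line.strip())
            return reply[0].lstrip("= ")

    gnugo = GTPEngine(["gnugo", "--mode", "gtp"])
    gnugo.send("boardsize 19")
    gnugo.send("clear_board")
    gnugo.send("play black Q16")               # our net's move, in GTP coordinates
    print(gnugo.send("genmove white"))         # GnuGo's reply move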

I have also played with the reinforcement learning phase... started with a
learning rate of .01... which I think was too high... That does take quite a bit
of time on my machine... so I didn't play too much with it yet.

Anyway... does anyone have any tales of how long it took to break 50%?
What is the magic bullet that will help me converge there quickly!

Here is a long-view graph of various attempts:

https://drive.google.com/file/d/0B0BbrXeL6VyCUFRkMlNPbzV2QTQ/view

The red and blue lines are from another member who ran 32 in a minibatch, a .01
learning rate and 128 filters in the middle layers vs. 192. They had 4 K40
GPUs I believe. They also used 4 training pairs to 4 validation
pairs... so I imagine that is why they had such a spread. There is a jump
in the accuracy which was when the learning rate was decreased to .001, I
believe.

Closer shot:

https://drive.google.com/file/d/0B0BbrXeL6VyCRVUxUFJaWVJBdEE/view

Most stay between the lines... but looking at both graphs makes me wonder
if any of the lines are approaching the convergence of DarkForest. My gut
tells me they were onto something... and I am rather curious about the playing
strength of the DarkForest SL network and the AG SL network.

Also... a picture of the network's view on a position... this one was
trained to 41% accuracy and played itself greedily.

https://drive.google.com/file/d/0B0BbrXeL6VyCNkRmVDBIYldraWs/view

Oh... and another thing... AG used KGS amateur data... FB and my networks
have been trained on pro games only. At one point I tested the 41% network
in the image (trained on pro data) and a 44% network trained on amateur
(KGS) games against GnuGo... and the pro-data network soundly won... and
the amateur network soundly lost... so I have stuck with pro since. Not sure if
the end result is the same... and I'm kinda glad the AG team used amateur data,
as that removes the argument that it somehow learned Lee Sedol's style.