Re: [Computer-go] 9x9 is last frontier?

2018-03-01 Thread Ingo Althöfer
Von: "David Doshay" 
> Go is hard.
> Programming is hard.
> 
> Programming Go is hard squared. 
> ;^)

And that's on square boards.
Mamma mia!

;-) Ingo.
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Using 9x9 policy on 13x13 and 19x19

2018-03-01 Thread Hiroshi Yamashita

I found a bug...

In 9_i50_F64L11_p.prototxt, "policy head",

  name: "conv_1x1_2_policy"
num_output: 2

should be

  name: "conv_1x1_1_policy"
num_output: 1


Hiroshi Yamashita

On 2018/03/02 4:21, Hiroshi Yamashita wrote:

Hi Imran,

I have only changed the input shape.
In Caffe,

from 9x9
input_dim: 1
input_dim: 50
input_dim: 9
input_dim: 9

to 19x19
input_dim: 1
input_dim: 50
input_dim: 19
input_dim: 19

This works because the network is fully convolutional.
http://computer-go.org/pipermail/computer-go/2015-December/008324.html

Caffe's prototxt is
http://www.yss-aya.com/20180302using9x9_19x19.tar.gz

Thanks,
Hiroshi Yamashita


On 2018/03/02 0:35, Imran Hendley wrote:

Hi Hiroshi,

Are you using zero-padding to allow input shapes to match for all board sizes?

Thanks,
Imran


___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Using 9x9 policy on 13x13 and 19x19

2018-03-01 Thread Imran Hendley
Perfect, thank you. I was wondering about doing this with a
fully-convolutional architecture, and it looks like that is what you are
doing!

On Thu, Mar 1, 2018 at 2:21 PM, Hiroshi Yamashita  wrote:

> Hi Imran,
>
> I have only changed the input shape.
> In Caffe,
>
> from 9x9
> input_dim: 1
> input_dim: 50
> input_dim: 9
> input_dim: 9
>
> to 19x19
> input_dim: 1
> input_dim: 50
> input_dim: 19
> input_dim: 19
>
> This works because the network is fully convolutional.
> http://computer-go.org/pipermail/computer-go/2015-December/008324.html
>
> Caffe's prototxt is
> http://www.yss-aya.com/20180302using9x9_19x19.tar.gz
>
> Thanks,
> Hiroshi Yamashita
>
>
> On 2018/03/02 0:35, Imran Hendley wrote:
>
>> Hi Hiroshi,
>>
>> Are you using zero-padding to allow input shapes to match for all board
>> sizes?
>>
>> Thanks,
>> Imran
>>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Using 9x9 policy on 13x13 and 19x19

2018-03-01 Thread Hiroshi Yamashita

Hi Imran,

I have only changed the input shape.
In Caffe,

from 9x9
input_dim: 1
input_dim: 50
input_dim: 9
input_dim: 9

to 19x19
input_dim: 1
input_dim: 50
input_dim: 19
input_dim: 19

This works because the network is fully convolutional.
http://computer-go.org/pipermail/computer-go/2015-December/008324.html

Caffe's prototxt is
http://www.yss-aya.com/20180302using9x9_19x19.tar.gz
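
As a rough sketch of why this works (made-up layer sizes, not Aya's actual
network; only the 50 input planes match the prototxt above), a purely
convolutional stack in TensorFlow/Keras accepts either board size with the
same weights, since convolution kernels never depend on the spatial size:

import tensorflow as tf

# Hypothetical layer sizes; only the 50 input planes match the prototxt above.
conv = tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu')
head = tf.keras.layers.Conv2D(1, 1, padding='same')  # 1x1 policy head, one output plane

def policy(x):
    # The very same weights are applied whatever the board size is.
    return head(conv(x))

print(policy(tf.zeros([1, 9, 9, 50])).shape)    # (1, 9, 9, 1)
print(policy(tf.zeros([1, 19, 19, 50])).shape)  # (1, 19, 19, 1)

No extra zero-padding is needed beyond the usual "same" padding of each
convolution; the filters simply slide over a larger board.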

Thanks,
Hiroshi Yamashita


On 2018/03/02 0:35, Imran Hendley wrote:

Hi Hiroshi,

Are you using zero-padding to allow input shapes to match for all board sizes?

Thanks,
Imran


___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] 9x9 is last frontier?

2018-03-01 Thread David Doshay
Go is hard.
Programming is hard.

Programming Go is hard squared. 
;^)

Cheers,
David G Doshay

ddos...@mac.com


> On 28 Feb 2018, at 5:43 PM, Hideki Kato  wrote:
> 
> Go is still hard for both human and computers :).

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Using 9x9 policy on 13x13 and 19x19

2018-03-01 Thread Imran Hendley
Hi Hiroshi,

Are you using zero-padding to allow input shapes to match for all board
sizes?

Thanks,
Imran

On Thu, Mar 1, 2018 at 12:06 AM, Cornelius  wrote:

> Hi Sighris,
>
> I have always thought that creating algorithms for arbitrarily large Go
> boards should enlighten us with regard to playing on smaller Go boards.
>
> A human's performance doesn't differ that much on differently sized large
> Go boards, and it scales pretty well. For example, one would find it
> rather easy to evaluate large dragons or ladders. The currently used
> neural networks (NNs) do not perform well in this regard.
>
> Maybe some form of RNN could be integrated into the evaluation.
>
>
> BR,
> Cornelius
>
>
> Am 24.02.2018 um 08:23 schrieb Sighris:
> > I'm curious, does anybody have any interest in programs for 23x23 (or
> > larger) Go boards?
> >
> > BR,
> > Sighris
> >
> >
> > On Fri, Feb 23, 2018 at 8:58 AM, Erik van der Werf <
> > erikvanderw...@gmail.com> wrote:
> >
> >> In the old days I trained separate move predictors on 9x9 games and on
> >> 19x19 games. In my case, the ones trained on 19x19 games beat the ones
> >> trained on 9x9 games also on the 9x9 board. Perhaps it was just because
> >> of having better data from 19x19, but I thought it was interesting to see
> >> that the 19x19 predictor generalized well to smaller boards.
> >>
> >> I suppose the result you see can easily be explained; the big-board
> >> policy learns about large-scale and small-scale fights, while the
> >> small-board policy doesn't know anything about large-scale fights.
> >>
> >> BR,
> >> Erik
> >>
> >>
> >> On Fri, Feb 23, 2018 at 5:11 PM, Hiroshi Yamashita 
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> Using 19x19 policy on 9x9 and 13x13 is effective.
> >>> But what about the opposite?
> >>> I made a 9x9 policy from Aya's 10k-playouts/move self-play games.
> >>>
> >>> Using 9x9 policy on 13x13 and 19x19
> >>> 19x19  DCNNAyaF128from9x9  1799
> >>> 13x13  DCNNAyaF128from9x9  1900
> >>> 9x9    DCNN_AyaF128a558x1  2290
> >>>
> >>> Using 19x19 policy on 9x9 and 13x13
> >>> 19x19  DCNN_AyaF128a523x1  2345
> >>> 13x13  DCNNAya795F128a523  2354
> >>> 9x9    DCNN_AyaF128a523x1  2179
> >>>
> >>> The 19x19 policy is of similar strength on 13x13 and 166 Elo weaker on 9x9.
> >>> The 9x9 policy is 390 Elo weaker on 13x13, and 491 Elo weaker on 19x19.
> >>> It seems the smaller-board policy is less useful on a bigger board than
> >>> the other way around...
> >>>
> >>> Note:
> >>> All programs play the maximum-policy move without search.
> >>> All programs use an opening book.
> >>> 19x19 policy is Filter128, Layer 12, without Batch Normalization.
> >>> 9x9 policy is Filter128, Layer 11, without Batch Normalization.
> >>> 19x19 policy is made from 78000 pro games (GoGoD).
> >>> 9x9 policy is made from 10k playouts/move self-play. It is CGOS 2892 (Aya797c_p1v1_10k).
> >>> Ratings are BayesElo.
> >>>
> >>> Thanks,
> >>> Hiroshi Yamashita
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Crazy Stone is back

2018-03-01 Thread Álvaro Begué
> I tried chain pooling too, and it was too slow. It made the network about
> twice as slow in TensorFlow (using tf.unsorted_segment_sum or max). I'd
> rather have twice as many layers.

tf.unsorted_segment_max didn't exist in the first public release of
TensorFlow, so I requested it just for this purpose (
https://github.com/tensorflow/tensorflow/issues/549). Too bad it's too slow
to be useful.

Thanks for sharing some details of what you have learned so far!

Álvaro.




On Thu, Mar 1, 2018 at 5:48 AM, Rémi Coulom  wrote:

> Hi David,
>
> Thanks for sharing your experiments. It is very interesting.
>
> I tried chain pooling too, and it was too slow. It made the network about
> twice as slow in TensorFlow (using tf.unsorted_segment_sum or max). I'd
> rather have twice as many layers.
>
> I never tried dilated convolutions. That sounds interesting.
>
> The value network of AQ has an interesting architecture. It does not go
> directly from 19x19 to scalar, but works like image-recognition networks,
> with 2x2 pooling until it reaches 1x1. I have not tried it yet, but that
> feels like a good idea.
>
> Rémi
>
> - Original Message -
> From: "David Wu" 
> To: computer-go@computer-go.org
> Sent: Wednesday, 28 February 2018 20:04:11
> Subject: Re: [Computer-go] Crazy Stone is back
>
>
>
>
> It's not even just liberties and semeai, it's also eyes. Consider for
> example a large dragon that has miai for 2 eyes in distant locations, and
> the opponent then takes one of them - you'd like the policy net to now
> suggest the other eye-making move far away. And you'd also like the value
> net to distinguish the three situations where the whole group has 2 eyes
> even when they are distant versus the ones where it doesn't.
>
>
> I've been doing experiments with somewhat smaller neural nets (roughly 4-7
> residual blocks = 8-14 layers), without sticking to an idealized "zero"
> approach. I've only experimented with policy nets so far, but presumably
> much of this should also transfer to a value net's understanding too.
>
>
>
> 1. One thing I tried was chain pooling, which was neat, but ultimately
> didn't seem promising:
>
> https://github.com/lightvector/GoNN#chain-pooling
> It solves all of these problems when the strings are solidly connected. It
> also helps when the strings are long but not quite solidly connected; the
> information still propagates faster than without it. But if there are lots
> of little strings forming a group, diagonal connections, bamboo joints,
> etc., then of course it won't help. And chain pooling is computationally
> costly, at least in TensorFlow, and it might have negative effects on the
> rest of the neural net that I don't understand.
>
>
>
> 2. A new thing I've been trying recently that actually does seem
> moderately promising is dilated convolutions, although I'm still early in
> testing. They also help increase the speed of information propagation, and
> don't require solidly connected strings, and also are reasonably cheap.
>
>
>
> In particular: my residual blocks have 192 channels, so I tried taking
> several of the later residual blocks in the neural net and making 64 of the
> channels of the first convolution in each block use dilated convolutions
> (leaving 128 channels of regular convolutions), with dilation factors of 2
> or 3. Intuitively, the idea is that earlier blocks could learn to compute
> 2x2 or 3x3 connectivity patterns, and then the dilated convolutions in
> later residual blocks will be able to use that to propagate information
> several spaces at a time across connected groups or dragons.
>
>
> So far, indications are that this works. When I looked at it in various
> board positions, it helped in a variety of capturing-race and
> large-dragon-two-eye-miai situations, correctly suggesting moves that the
> net without dilated convolutions would fail to find due to the move being
> too far away. Also, dilated convolutions seem pretty cheap - they only
> slightly increase the computational cost of the net.
>
>
> So far, I've found that it doesn't significantly improve the overall loss
> function, presumably because now there are 128 channels instead of 192
> channels of ordinary convolutions, so in return for being better at
> long-distance interactions, the neural net has gotten worse at some local
> tactics. But it also hasn't gotten worse the way it would if I simply
> dropped the number of channels from 192 to 128 without adding any new
> channels, so the dilated convolutions are being "used" for real work.
>
> I'd be curious to hear if anyone else has tried dilated convolutions and
> what results they got. If there's anything at all to do other than just add
> more layers, I think they're the most promising thing I know of.
>
>
>
>
> On Wed, Feb 28, 2018 at 12:34 PM, Rémi Coulom < remi.cou...@free.fr >
> wrote:
>
>
> 192 and 256 are the numbers of channels. They are fully connected, so the
> number of 3x3 filters is 192^2, and 256^2.

Re: [Computer-go] Crazy Stone is back

2018-03-01 Thread Rémi Coulom
Hi David,

Thanks for sharing your experiments. It is very interesting.

I tried chain pooling too, and it was too slow. It made the network about
twice as slow in TensorFlow (using tf.unsorted_segment_sum or max). I'd
rather have twice as many layers.

I never tried dilated convolutions. That sounds interesting.

The value network of AQ has an interesting architecture. It does not go 
directly from 19x19 to scalar, but works like image-recognition networks, with 
2x2 pooling until it reaches 1x1. I have not tried it yet, but that feels like 
a good idea.
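
As a sketch of that kind of head (channel counts and the exact pooling
schedule are placeholders, not AQ's actual architecture), one way to write it
in TensorFlow/Keras would be:

import tensorflow as tf
from tensorflow.keras import layers

def value_head(x):
    # Shrink 19x19 step by step with 2x2 pooling instead of flattening at once.
    for filters in (64, 64, 64, 64):             # spatial size: 19 -> 10 -> 5 -> 3 -> 2
        x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
        x = layers.MaxPool2D(2, padding='same')(x)
    x = layers.GlobalAveragePooling2D()(x)        # collapse the remaining 2x2 to a vector
    return layers.Dense(1, activation='tanh')(x)  # scalar value in [-1, 1]

inputs = tf.keras.Input(shape=(19, 19, 192))      # e.g. the trunk's last feature map
model = tf.keras.Model(inputs, value_head(inputs))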

Rémi

- Original Message -
From: "David Wu" 
To: computer-go@computer-go.org
Sent: Wednesday, 28 February 2018 20:04:11
Subject: Re: [Computer-go] Crazy Stone is back




It's not even just liberties and semeai, it's also eyes. Consider for example a 
large dragon that has miai for 2 eyes in distant locations, and the opponent 
then takes one of them - you'd like the policy net to now suggest the other 
eye-making move far away. And you'd also like the value net to distinguish the 
three situations where the whole group has 2 eyes even when they are distant 
versus the ones where it doesn't. 


I've been doing experiments with somewhat smaller neural nets (roughly 4-7 
residual blocks = 8-14 layers), without sticking to an idealized "zero" 
approach. I've only experimented with policy nets so far, but presumably much 
of this should also transfer to a value net's understanding too. 



1. One thing I tried was chain pooling, which was neat, but ultimately didn't 
seem promising: 

https://github.com/lightvector/GoNN#chain-pooling 
It solves all of these problems when the strings are solidly connected. It
also helps when the strings are long but not quite solidly connected; the
information still propagates faster than without it. But if there are lots of
little strings forming a group, diagonal connections, bamboo joints, etc.,
then of course it won't help. And chain pooling is computationally costly, at
least in TensorFlow, and it might have negative effects on the rest of the
neural net that I don't understand.
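
For reference, here is a minimal sketch of the chain-pooling idea in
TensorFlow (an illustration, not the exact GoNN implementation); chain_ids
would be precomputed from the position, one id per string, with empty points
given their own ids:

import tensorflow as tf

def chain_pool_max(features, chain_ids, num_chains):
    # features:  [H*W, C] per-point features from some layer of the net
    # chain_ids: [H*W]    integer chain (string) id for every board point
    # Every point receives the channel-wise max over its whole chain, so
    # information jumps along solidly connected strings in a single step.
    pooled = tf.math.unsorted_segment_max(features, chain_ids, num_chains)  # [num_chains, C]
    return tf.gather(pooled, chain_ids)                                     # [H*W, C]

# Tiny usage example: 4 points, 2 chains, 3 channels.
out = chain_pool_max(tf.random.normal([4, 3]), tf.constant([0, 0, 1, 1]), num_chains=2)

The unsorted_segment_* ops are the part reported as slow elsewhere in this
thread.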



2. A new thing I've been trying recently that actually does seem moderately 
promising is dilated convolutions, although I'm still early in testing. They 
also help increase the speed of information propagation, and don't require 
solidly connected strings, and also are reasonably cheap. 



In particular: my residual blocks have 192 channels, so I tried taking several 
of the later residual blocks in the neural net and making 64 of the channels of 
the first convolution in each block use dilated convolutions (leaving 128 
channels of regular convolutions), with dilation factors of 2 or 3. 
Intuitively, the idea is that earlier blocks could learn to compute 2x2 or 3x3 
connectivity patterns, and then the dilated convolutions in later residual 
blocks will be able to use that to propagate information several spaces at a 
time across connected groups or dragons. 
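
As a concrete (hypothetical) version of that layout, assuming a 192-channel
trunk and guessing at where the activations sit, one such block could be:

import tensorflow as tf
from tensorflow.keras import layers

def mixed_dilation_block(x, regular=128, dilated=64, rate=2):
    # First convolution: 128 ordinary 3x3 channels plus 64 dilated 3x3 channels.
    y1 = layers.Conv2D(regular, 3, padding='same', activation='relu')(x)
    y2 = layers.Conv2D(dilated, 3, padding='same', dilation_rate=rate,
                       activation='relu')(x)
    y = layers.Concatenate()([y1, y2])                  # back to 192 channels
    # Second convolution of the residual block stays a plain 3x3.
    y = layers.Conv2D(regular + dilated, 3, padding='same')(y)
    return layers.ReLU()(layers.Add()([x, y]))          # residual connection

inputs = tf.keras.Input(shape=(19, 19, 192))
outputs = mixed_dilation_block(inputs, rate=3)          # dilation factor 2 or 3
model = tf.keras.Model(inputs, outputs)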


So far, indications are that this works. When I looked at it in various board
positions, it helped in a variety of capturing-race and
large-dragon-two-eye-miai situations, correctly suggesting moves that the net
without dilated convolutions would fail to find due to the move being too far
away. Also, dilated convolutions seem pretty cheap - they only slightly
increase the computational cost of the net.


So far, I've found that it doesn't significantly improve the overall loss 
function, presumably because now there are 128 channels instead of 192 channels 
of ordinary convolutions, so in return for being better at long-distance 
interactions, the neural net has gotten worse at some local tactics. But it 
also hasn't gotten worse the way it would if I simply dropped the number of 
channels from 192 to 128 without adding any new channels, so the dilated 
convolutions are being "used" for real work. 

I'd be curious to hear if anyone else has tried dilated convolutions and what 
results they got. If there's anything at all to do other than just add more 
layers, I think they're the most promising thing I know of. 




On Wed, Feb 28, 2018 at 12:34 PM, Rémi Coulom < remi.cou...@free.fr > wrote: 


192 and 256 are the numbers of channels. They are fully connected, so the 
number of 3x3 filters is 192^2, and 256^2. 
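
In other words (a quick sanity check, not from the original message), the
weight count of one such fully connected 3x3 convolution layer is:

# Weights in a 3x3 convolution layer mapping C channels to C channels.
def conv3x3_weights(channels):
    return 3 * 3 * channels * channels

print(conv3x3_weights(192))  # 331776 = 9 * 192^2
print(conv3x3_weights(256))  # 589824 = 9 * 256^2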

Having liberty counts and string size as input helps, but it solves only a 
small part of the problem. You can't read a semeai from just the liberty-count 
information. 

I tried to be clever and find ways to propagate information along strings in 
the network. But all the techniques I tried make the network much slower. 
Adding more layers is simple and works. 

Rémi 

- Original Message -
From: "Darren Cook" < dar...@dcook.org >
To: computer-go@computer-go.org
Sent: Wednesday, 28 February 2018 16:43:10
Subject: Re: [Computer-go] Crazy Stone is back



> Weights_31_3200 is 20 layers of 192, 3200 board evaluations per 

Re: [Computer-go] Crazy Stone is back

2018-03-01 Thread Rémi Coulom
Hi Hiroshi,

Yes, Weights_33_400 was trained on 9x9. None of the Weights bots use playouts.
I experimented with training different network architectures on the same
self-play data, so newer networks are not necessarily stronger than older ones.

Rémi

- Original Message -
From: "Hiroshi Yamashita" 
To: computer-go@computer-go.org
Sent: Wednesday, 28 February 2018 23:24:02
Subject: Re: [Computer-go] Crazy Stone is back

Hi Remi,

Wow, "Weights" is your engine. So my guess was right :-)
On 9x9 CGOS, did you train on 9x9, or just use the 19x19 network?
Weights_33_400 is stronger than Weights_40_400.
Maybe it is because Weights_33_400 uses CrazyStone's playouts and
Weights_40_400 does not?

Thanks,
Hiroshi Yamashita

On 2018/02/28 15:13, Rémi Coulom wrote:
> Hi,
> 
> I have just connected the newest version of Crazy Stone to CGOS. It is based 
> on the AlphaZero approach. The "Weights" engines were in fact previous 
> experimental versions. CrazyStone-18.03 is using time control and pondering 
> instead of a fixed number of evaluations per move. So it should be much 
> stronger than Weights_31_3200.
> 
> Does anybody know who cronus is? It is _extremely_ strong. Its rating is low 
> because it has had only weaker opponents, but it is undefeated so far, except 
> for one loss on time, and some losses against other versions of itself. It 
> has just won two games in a row against Crazy Stone.
> 
> I hope the other strong engines will reconnect, too.
> 
> Rémi
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go