Re: [Computer-go] UEC cup 2nd day

2016-03-24 Thread Álvaro Begué
I have used TensorFlow to train a CNN that predicts the next move, with a
similar architecture to what others have used (one layer of 5x5 convolutions
followed by 10 more layers of 3x3 convolutions, with 192 hidden units per
layer and ReLU activation functions) but with much simpler inputs. I found
the Python parts worked quite well, and that's how I did the training,
using a GTX 980 GPU. But getting the trained network to work from C++ was a
pain, and I ended up rolling my own code to evaluate it using the
CPU(s).
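
For the curious, here is a rough sketch of that tower in the
TensorFlow 1.x-era Python API. The input plane count and the final
1x1-convolution move head are placeholders of mine, not exactly what I
used:

    import tensorflow as tf  # 1.x-era API (tf.placeholder, etc.)

    INPUT_PLANES = 8  # placeholder: depends on the input encoding
    HIDDEN = 192      # hidden units (filters) per layer

    def conv(x, k, in_ch, out_ch):
        # One convolution + ReLU; SAME padding keeps the 19x19 shape.
        w = tf.Variable(tf.truncated_normal([k, k, in_ch, out_ch], stddev=0.05))
        b = tf.Variable(tf.zeros([out_ch]))
        return tf.nn.relu(tf.nn.conv2d(x, w, [1, 1, 1, 1], 'SAME') + b)

    board = tf.placeholder(tf.float32, [None, 19, 19, INPUT_PLANES])
    net = conv(board, 5, INPUT_PLANES, HIDDEN)  # one layer of 5x5
    for _ in range(10):                         # ten layers of 3x3
        net = conv(net, 3, HIDDEN, HIDDEN)

    # Move head: 1x1 convolution down to one logit per board point.
    w_out = tf.Variable(tf.truncated_normal([1, 1, HIDDEN, 1], stddev=0.05))
    logits = tf.reshape(tf.nn.conv2d(net, w_out, [1, 1, 1, 1], 'SAME'), [-1, 361])
    move = tf.placeholder(tf.int32, [None])     # index of the move played
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=move, logits=logits))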

I also tried to train the network to predict the final ownership for each
point on the board (in the hopes that it would learn about life and death),
but this didn't work too well, presumably because I didn't have good enough
data to train it with.
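
If anyone wants to try the same thing, one natural setup (again just a
sketch continuing the code above, not exactly what I did; the target
encoding is a placeholder) is a second 1x1 head trained with a
per-point sigmoid cross-entropy loss:

    # Ownership head on top of the same `net` features as above.
    w_own = tf.Variable(tf.truncated_normal([1, 1, HIDDEN, 1], stddev=0.05))
    own_logits = tf.reshape(tf.nn.conv2d(net, w_own, [1, 1, 1, 1], 'SAME'), [-1, 361])
    # Target: 1.0 where the point finally belongs to the player to move.
    ownership = tf.placeholder(tf.float32, [None, 361])
    own_loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=ownership, logits=own_logits))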

Álvaro.



On Thu, Mar 24, 2016 at 2:42 PM, Darren Cook  wrote:

> Thanks for the very interesting replies, David and Remi.
>
> No-one is using TensorFlow, then? Any reason not to? (I'm just curious
> because there looks to be a good Udacity DNN course
> (https://www.udacity.com/course/deep-learning--ud730), which I was
> considering, but it is using TensorFlow.)
>
>
> Remi wrote:
> > programming back-propagation efficiently on the GPU. We did get a GPU
> > version working, but it took a lot of time to program it, and was not
> > so efficient. So the current DCNN of Crazy Stone is 100% trained on
> > the CPU, and 100% running on the CPU. My CPU code is efficient,
> > though. It is considerably faster than Caffe. My impression is that
> > Caffe is inefficient because it uses the GEMM approach, which may be
> > good for high-resolution pictures, but is not for small 19x19
> > boards.
>
> I did a bit of study on what GEMM is, and found this article and the 2nd
> comment on it quite interesting:
>
>
> http://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
>
> The comment, by Scott Gray, mentioned:
>
>   So instead of thinking of convolution as a problem of one large gemm
> operation, it’s actually much more efficient as many small gemms. To
> compute a large gemm on a GPU you need to break it up into many small
> tiles anyway. So rather than waste time duplicating your data into a
> large matrix, you can just start doing small gemms right away directly
> on the data. Let the L2 cache do the duplication for you.
>
>
> He doesn't quantify large vs. small, though I doubt anyone is doing
> image recognition on 19x19 pixel images :-)
>
> Darren

Re: [Computer-go] UEC cup 2nd day

2016-03-24 Thread Darren Cook
Thanks for the very interesting replies, David and Remi.

No-one is using TensorFlow, then? Any reason not to? (I'm just curious
because there looks to be a good Udacity DNN course
(https://www.udacity.com/course/deep-learning--ud730), which I was
considering, but it is using TensorFlow.)


Remi wrote:
> programming back-propagation efficiently on the GPU. We did get a GPU
> version working, but it took a lot of time to program it, and was not
> so efficient. So the current DCNN of Crazy Stone is 100% trained on
> the CPU, and 100% running on the CPU. My CPU code is efficient,
> though. It is considerably faster than Caffe. My impression is that
> Caffe is inefficient because it uses the GEMM approach, which may be
> good for high-resolution pictures, but is not for small 19x19
> boards.

I did a bit of study on what GEMM is, and found this article and the 2nd
comment on it quite interesting:


http://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/

The comment, by Scott Gray, mentioned:

  So instead of thinking of convolution as a problem of one large gemm
operation, it’s actually much more efficient as many small gemms. To
compute a large gemm on a GPU you need to break it up into many small
tiles anyway. So rather than waste time duplicating your data into a
large matrix, you can just start doing small gemms right away directly
on the data. Let the L2 cache do the duplication for you.


He doesn't quantify large vs. small, though I doubt anyone is doing
image recognition on 19x19 pixel images :-)
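
To make the duplication concrete: the "one large gemm" approach is
im2col, which unrolls every KxK input patch into a row of a matrix so
the whole convolution becomes a single matrix product. A toy NumPy
version (my own illustration, valid padding only) shows the cost; with
3x3 filters on a 19x19 board each input value gets copied up to 9
times:

    import numpy as np

    def im2col(x, k):
        # x: (channels, H, W) -> (H_out*W_out, channels*k*k), one
        # unrolled kxk patch per row. Each input value is duplicated
        # into up to k*k rows -- the waste Scott Gray describes.
        c, h, w = x.shape
        ho, wo = h - k + 1, w - k + 1
        cols = np.empty((ho * wo, c * k * k), dtype=x.dtype)
        for i in range(ho):
            for j in range(wo):
                cols[i * wo + j] = x[:, i:i + k, j:j + k].ravel()
        return cols

    def conv_gemm(x, filters):
        # filters: (num_filters, channels, k, k); the convolution is
        # now one (H_out*W_out, c*k*k) @ (c*k*k, num_filters) product.
        n, c, k, _ = filters.shape
        out = im2col(x, k) @ filters.reshape(n, -1).T
        return out.T.reshape(n, x.shape[1] - k + 1, x.shape[2] - k + 1)

    x = np.random.randn(192, 19, 19).astype(np.float32)
    f = np.random.randn(192, 192, 3, 3).astype(np.float32)
    y = conv_gemm(x, f)                     # shape (192, 17, 17)
    print(im2col(x, 3).nbytes / x.nbytes)   # ~7.2x data blow-up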

Darren

Re: [Computer-go] *****SPAM***** Re: UEC cup 2nd day

2016-03-24 Thread Jim O'Flaherty
Which one is Remi's?

On Thu, Mar 24, 2016 at 1:09 AM, David Fotland 
wrote:

> There was one program (Shrike) that had a DNN without search.  It didn’t
> finish in the top 8.  Zen and Crazystone have custom DNN implementations.
> Dark Forest uses Torch.  The rest used Caffe.
>
> Remi's implementation is unusual and interesting.  I'll let him share it
> if he wants to.
>
> David
>
> > -----Original Message-----
> > From: Computer-go [mailto:computer-go-boun...@computer-go.org] On
> Behalf Of
> > Darren Cook
> > Sent: Wednesday, March 23, 2016 5:19 AM
> > To: computer-go@computer-go.org
> > Subject: *SPAM* Re: [Computer-go] UEC cup 2nd day
> >
> > David Fotland wrote:
> > > There are 12 programs here that have deep neural nets.  2 were not
> > > qualified for the second day, and six of them made the final 8.  Many
> > Faces has very basic DNN support, but it's turned off because it isn't
> > making the program stronger yet.  Only Dolburam and Many Faces don't
> > have DNN in the final 8.  Dolburam won in Beijing, but the DNN
> > programs are stronger and it didn't make the final 4.
> >
> > Are all the DNN programs (or, at least, all 6 in the top 8) also using
> MCTS?
> > (Re-phrased: is there any currently strong program not using MCTS?)
> >
> > Darren

Re: [Computer-go] UEC cup 2nd day

2016-03-24 Thread Rémi Coulom
Hi,

This UEC Cup was really very exciting.

I had started to code my own home-made deep learning library in November, after 
finishing my Japanese mahjong engine. I was working quietly on it when the 
AlphaGo paper was published. Then I felt that I had to urgently get something
to work before the UEC Cup. I hired Fabien Letouzey in February. We worked hard 
for 6 weeks, and improved Crazy Stone tremendously.

I chose to build my own deep-learning library. That was a very interesting 
experience. I underestimated the complexity of programming back-propagation 
efficiently on the GPU. We did get a GPU version working, but it took a lot of 
time to program it, and was not so efficient. So the current DCNN of Crazy 
Stone is 100% trained on the CPU, and 100% running on the CPU. My CPU code is 
efficient, though. It is considerably faster than Caffe. My impression is that 
Caffe is inefficient because it uses the GEMM approach, which may be good for 
high-resolution pictures, but is not for small 19x19 boards.

The neural network that I used in the UEC Cup started to learn just 10 days 
before the tournament, on a 24-core Xeon. I finished the details of 
incorporating the neural network into MCTS two days before the tournament. I 
was pleasantly surprised to find that the new version won 88% of its games
against the previous version at one minute / game. Then I connected it to KGS, 
and it established a strong 7d rank.
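
I won't go into my exact formula here, but as a rough illustration of
how a policy network can guide the search, here is a selection rule in
the style of the AlphaGo paper's PUCT term (a sketch, not my
implementation):

    import math

    C_PUCT = 1.5  # exploration constant, needs tuning

    class Edge:
        def __init__(self, prior):
            self.prior = prior    # P(move) from the DCNN
            self.visits = 0       # N
            self.value_sum = 0.0  # W

        def q(self):
            return self.value_sum / self.visits if self.visits else 0.0

    def select(edges, parent_visits):
        # Q + U: the U term decays as an edge is visited but scales
        # with the network prior, so the DCNN steers early exploration.
        def score(e):
            u = C_PUCT * e.prior * math.sqrt(parent_visits) / (1 + e.visits)
            return e.q() + u
        return max(edges, key=score)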

It was really nice to meet all these new programmers in the UEC Cup. I am sure 
next year will be very exciting, too. I expect that playing strength will have 
improved tremendously in one year.

Rémi

----- Original Message -----
From: "David Fotland" 
To: computer-go@computer-go.org
Sent: Thursday, March 24, 2016 15:09:11
Subject: Re: [Computer-go] *SPAM* Re: UEC cup 2nd day

There was one program (Shrike) that had a DNN without search.  It didn’t finish 
in the top 8.  Zen and Crazystone have custom DNN implementations.  Dark Forest 
uses Torch.  The rest used Caffe.

Remi's implementation is unusual and interesting.  I'll let him share it if he 
wants to.

David

> -----Original Message-----
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
> Darren Cook
> Sent: Wednesday, March 23, 2016 5:19 AM
> To: computer-go@computer-go.org
> Subject: *SPAM* Re: [Computer-go] UEC cup 2nd day
> 
> David Fotland wrote:
> > There are 12 programs here that have deep neural nets.  2 were not
> > qualified for the second day, and six of them made the final 8.  Many
> > Faces has very basic DNN support, but it's turned off because it isn't
> > making the program stronger yet.  Only Dolburam and Many Faces don't
> > have DNN in the final 8.  Dolburam won in Beijing, but the DNN
> > programs are stronger and it didn't make the final 4.
> 
> Are all the DNN programs (or, at least, all 6 in the top 8) also using MCTS?
> (Re-phrased: is there any currently strong program not using MCTS?)
> 
> Darren

Re: [Computer-go] *****SPAM***** Re: UEC cup 2nd day

2016-03-24 Thread David Fotland
There was one program (Shrike) that had a DNN without search.  It didn’t finish 
in the top 8.  Zen and Crazystone have custom DNN implementations.  Dark Forest 
uses Torch.  The rest used Caffe.

Remi's implementation is unusual and interesting.  I'll let him share it if he 
wants to.

David

> -----Original Message-----
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
> Darren Cook
> Sent: Wednesday, March 23, 2016 5:19 AM
> To: computer-go@computer-go.org
> Subject: *SPAM* Re: [Computer-go] UEC cup 2nd day
> 
> David Fotland wrote:
> > There are 12 programs here that have deep neural nets.  2 were not
> > qualified for the second day, and six of them made the final 8.  Many
> > Faces has very basic DNN support, but it's turned off because it isn't
> > making the program stronger yet.  Only Dolburam and Many Faces don't
> > have DNN in the final 8.  Dolburam won in Beijing, but the DNN
> > programs are stronger and it didn't make the final 4.
> 
> Are all the DNN programs (or, at least, all 6 in the top 8) also using MCTS?
> (Re-phrased: is there any currently strong program not using MCTS?)
> 
> Darren