Re: [Computer-go] UEC cup 2nd day
I have used TensorFlow to train a CNN that predicts the next move, with a similar architecture to what others have used (one layer of 5x5 convolutions followed by 10 more layers of 3x3 convolutions, with 192 hidden units per layer and ReLU activation functions) but with much simpler inputs. I found the Python parts worked quite well, and that's how I did the training, using a GTX 980 GPU. But getting the trained network to work from C++ was a pain, and I ended up rolling my own code to evaluate it using the CPU(s).

I also tried to train the network to predict the final ownership of each point on the board (in the hope that it would learn about life and death), but this didn't work too well, presumably because I didn't have good enough data to train it with.

Álvaro.

On Thu, Mar 24, 2016 at 2:42 PM, Darren Cook wrote:
> Thanks for the very interesting replies, David, and Remi.
>
> No-one is using TensorFlow, then? Any reason not to? (I'm just curious
> because there looks to be a good Udacity DNN course
> (https://www.udacity.com/course/deep-learning--ud730), which I was
> considering, but it is using TensorFlow.)
>
> Remi wrote:
> > programming back-propagation efficiently on the GPU. We did get a GPU
> > version working, but it took a lot of time to program it, and was not
> > so efficient. So the current DCNN of Crazy Stone is 100% trained on
> > the CPU, and 100% running on the CPU. My CPU code is efficient,
> > though. It is considerably faster than Caffe. My impression is that
> > Caffe is inefficient because it uses the GEMM approach, which may be
> > good for high-resolution pictures, but is not for small 19x19
> > boards.
>
> I did a bit of study on what GEMM is, and found this article and the 2nd
> comment on it quite interesting:
>
> http://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
>
> The comment, by Scott Gray, mentioned:
>
> So instead of thinking of convolution as a problem of one large gemm
> operation, it's actually much more efficient as many small gemms. To
> compute a large gemm on a GPU you need to break it up into many small
> tiles anyway. So rather than waste time duplicating your data into a
> large matrix, you can just start doing small gemms right away directly
> on the data. Let the L2 cache do the duplication for you.
>
> He doesn't quantify large vs. small; though I doubt anyone is doing
> image recognition on 19x19 pixel images :-)
>
> Darren
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
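[Editor's note: a back-of-the-envelope check of the architecture Álvaro describes — this is an illustrative sketch, not his code. A stack of one 5x5 layer and ten 3x3 layers, all stride 1, has a receptive field of 25 points, so every output unit can in principle see the whole 19x19 board:]

```python
# Receptive-field arithmetic for a stack of stride-1 convolutions:
# each k x k layer grows the receptive field by (k - 1).
def receptive_field(kernel_sizes):
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

layers = [5] + [3] * 10      # one 5x5 layer, then ten 3x3 layers
print(receptive_field(layers))   # 25 -- wider than the 19-point board
```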
Re: [Computer-go] UEC cup 2nd day
Thanks for the very interesting replies, David, and Remi.

No-one is using TensorFlow, then? Any reason not to? (I'm just curious because there looks to be a good Udacity DNN course (https://www.udacity.com/course/deep-learning--ud730), which I was considering, but it is using TensorFlow.)

Remi wrote:
> programming back-propagation efficiently on the GPU. We did get a GPU
> version working, but it took a lot of time to program it, and was not
> so efficient. So the current DCNN of Crazy Stone is 100% trained on
> the CPU, and 100% running on the CPU. My CPU code is efficient,
> though. It is considerably faster than Caffe. My impression is that
> Caffe is inefficient because it uses the GEMM approach, which may be
> good for high-resolution pictures, but is not for small 19x19
> boards.

I did a bit of study on what GEMM is, and found this article and the 2nd comment on it quite interesting:

http://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/

The comment, by Scott Gray, mentioned:

So instead of thinking of convolution as a problem of one large gemm operation, it's actually much more efficient as many small gemms. To compute a large gemm on a GPU you need to break it up into many small tiles anyway. So rather than waste time duplicating your data into a large matrix, you can just start doing small gemms right away directly on the data. Let the L2 cache do the duplication for you.

He doesn't quantify large vs. small; though I doubt anyone is doing image recognition on 19x19 pixel images :-)

Darren
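[Editor's note: to make the GEMM point concrete, here is a minimal NumPy sketch — not Caffe's actual code; the function name and shapes are invented for illustration — of the im2col lowering that turns a convolution into one big matrix multiply, at the cost of copying each interior input value k*k times:]

```python
import numpy as np

def conv2d_im2col(x, w):
    """'Valid' 2D convolution (stride 1, no padding) of one input plane x
    with one k x k filter w, done the im2col + GEMM way: copy every
    k x k patch of x into a row of a matrix, then do a single multiply."""
    k = w.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    cols = np.empty((out_h * out_w, k * k))
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = x[i:i + k, j:j + k].ravel()
    # The whole convolution is now one GEMM (here a matrix-vector product)
    return (cols @ w.ravel()).reshape(out_h, out_w), cols

rng = np.random.default_rng(0)
x = rng.standard_normal((19, 19))    # a Go-board-sized input plane
w = rng.standard_normal((3, 3))

out, cols = conv2d_im2col(x, w)

# Check against a direct (naive) convolution
ref = np.array([[np.sum(x[i:i + 3, j:j + 3] * w) for j in range(17)]
                for i in range(17)])
assert np.allclose(out, ref)

# Memory cost of the lowering: the im2col matrix holds 17*17*9 = 2601
# values for a 361-value input -- interior points are duplicated 9x,
# which is the waste Gray's comment says tiled small gemms avoid.
print(cols.size, x.size)   # 2601 361
```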
Re: [Computer-go] *****SPAM***** Re: UEC cup 2nd day
Which one is Remi's?

On Thu, Mar 24, 2016 at 1:09 AM, David Fotland wrote:
> There was one program (Shrike) that had a dnn without search. It didn't
> finish in the top 8. Zen and Crazystone have custom DNN implementations.
> Dark Forest uses Torch. The rest used Caffe.
>
> Remi's implementation is unusual and interesting. I'll let him share it
> if he wants to.
>
> David
>
> > -----Original Message-----
> > From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
> > Darren Cook
> > Sent: Wednesday, March 23, 2016 5:19 AM
> > To: computer-go@computer-go.org
> > Subject: *SPAM* Re: [Computer-go] UEC cup 2nd day
> >
> > David Fotland wrote:
> > > There are 12 programs here that have deep neural nets. 2 were not
> > > qualified for the second day, and six of them made the final 8. Many
> > > Faces has very basic DNN support, but it's turned off because it isn't
> > > making the program stronger yet. Only Dolburam and Many Faces don't
> > > have DNN in the final 8. Dolburam won in Beijing, but the DNN
> > > programs are stronger and it didn't make the final 4.
> >
> > Are all the DNN programs (or, at least, all 6 in the top 8) also using MCTS?
> > (Re-phrased: is there any currently strong program not using MCTS?)
> >
> > Darren
Re: [Computer-go] UEC cup 2nd day
Hi,

This UEC Cup was really very exciting.

I had started to code my own home-made deep learning library in November, after finishing my Japanese mahjong engine. I was working quietly on it when the AlphaGo paper was published. Then I felt that I had to urgently get something to work before the UEC Cup. I hired Fabien Letouzey in February. We worked hard for 6 weeks, and improved Crazy Stone tremendously.

I chose to build my own deep-learning library. That was a very interesting experience. I underestimated the complexity of programming back-propagation efficiently on the GPU. We did get a GPU version working, but it took a lot of time to program it, and was not so efficient. So the current DCNN of Crazy Stone is 100% trained on the CPU, and 100% running on the CPU. My CPU code is efficient, though. It is considerably faster than Caffe. My impression is that Caffe is inefficient because it uses the GEMM approach, which may be good for high-resolution pictures, but is not for small 19x19 boards.

The neural network that I used in the UEC Cup started to learn just 10 days before the tournament, on a 24-core Xeon. I finished the details of incorporating the neural network into MCTS two days before the tournament. I was happily surprised to find that the new version wins 88% of its games against the previous version at one minute / game. Then I connected it to KGS, and it established a strong 7d rank.

It was really nice to meet all these new programmers at the UEC Cup. I am sure next year will be very exciting, too. I expect that playing strength will have improved tremendously in one year.

Rémi

----- Original Message -----
From: "David Fotland" To: computer-go@computer-go.org
Sent: Thursday, March 24, 2016 15:09:11
Subject: Re: [Computer-go] *SPAM* Re: UEC cup 2nd day

There was one program (Shrike) that had a dnn without search. It didn't finish in the top 8. Zen and Crazystone have custom DNN implementations. Dark Forest uses Torch. The rest used Caffe.

Remi's implementation is unusual and interesting. I'll let him share it if he wants to.

David

> -----Original Message-----
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
> Darren Cook
> Sent: Wednesday, March 23, 2016 5:19 AM
> To: computer-go@computer-go.org
> Subject: *SPAM* Re: [Computer-go] UEC cup 2nd day
>
> David Fotland wrote:
> > There are 12 programs here that have deep neural nets. 2 were not
> > qualified for the second day, and six of them made the final 8. Many
> > Faces has very basic DNN support, but it's turned off because it isn't
> > making the program stronger yet. Only Dolburam and Many Faces don't
> > have DNN in the final 8. Dolburam won in Beijing, but the DNN
> > programs are stronger and it didn't make the final 4.
>
> Are all the DNN programs (or, at least, all 6 in the top 8) also using MCTS?
> (Re-phrased: is there any currently strong program not using MCTS?)
>
> Darren
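[Editor's note: Rémi doesn't describe how Crazy Stone incorporates the network into MCTS. For readers new to the idea, a common approach — the PUCT selection rule from the AlphaGo paper; the names and numbers below are illustrative, not Crazy Stone's code — uses the network's move probabilities as priors that steer exploration in the tree:]

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U, where the policy network's
    prior P boosts the exploration term U for promising moves."""
    total_visits = sum(ch["visits"] for ch in children)
    def score(ch):
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        u = c_puct * ch["prior"] * math.sqrt(total_visits + 1) / (1 + ch["visits"])
        return q + u
    return max(children, key=score)

# Toy example: an unvisited move with a high network prior is tried
# before a mediocre move that already has many visits.
children = [
    {"move": "D4",  "prior": 0.60, "visits": 0,  "value_sum": 0.0},
    {"move": "Q16", "prior": 0.10, "visits": 20, "value_sum": 9.0},
]
best = puct_select(children)
print(best["move"])   # D4
```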
Re: [Computer-go] *****SPAM***** Re: UEC cup 2nd day
There was one program (Shrike) that had a dnn without search. It didn't finish in the top 8. Zen and Crazystone have custom DNN implementations. Dark Forest uses Torch. The rest used Caffe.

Remi's implementation is unusual and interesting. I'll let him share it if he wants to.

David

> -----Original Message-----
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
> Darren Cook
> Sent: Wednesday, March 23, 2016 5:19 AM
> To: computer-go@computer-go.org
> Subject: *SPAM* Re: [Computer-go] UEC cup 2nd day
>
> David Fotland wrote:
> > There are 12 programs here that have deep neural nets. 2 were not
> > qualified for the second day, and six of them made the final 8. Many
> > Faces has very basic DNN support, but it's turned off because it isn't
> > making the program stronger yet. Only Dolburam and Many Faces don't
> > have DNN in the final 8. Dolburam won in Beijing, but the DNN
> > programs are stronger and it didn't make the final 4.
>
> Are all the DNN programs (or, at least, all 6 in the top 8) also using MCTS?
> (Re-phrased: is there any currently strong program not using MCTS?)
>
> Darren