Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi,

as I want to buy a graphics card for CNN: do I need double precision performance? I gave caffe (http://caffe.berkeleyvision.org/) a try, and as far as I understood, most of it is done in single precision.

You can get comparable single precision performance from NVIDIA (as caffe uses CUDA, I am looking at NVIDIA) for about $340, but the double precision performance is 10x lower than on the $1000 cards.

Thanks a lot,
Detlef

On Wednesday, 24.12.2014, 12:14 +0800, hughperkins2 wrote:
> Whilst it's technically true that you can use an NN with one hidden layer to learn the same function as a deeper net, you might need a combinatorially large number of nodes :-) "Scaling learning algorithms towards AI", by Bengio and LeCun, 2007, makes a convincing case along these lines.

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
No, you don't need double precision at all.

Álvaro.

On Thu, Dec 25, 2014 at 5:00 AM, Detlef Schmicker d...@physik.de wrote:
> as I want to buy a graphics card for CNN: do I need double precision performance? I gave caffe (http://caffe.berkeleyvision.org/) a try, and as far as I understood, most of it is done in single precision.
> You can get comparable single precision performance from NVIDIA (as caffe uses CUDA, I am looking at NVIDIA) for about $340, but the double precision performance is 10x lower than on the $1000 cards.
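[A quick sanity check of the claim above, not from the original thread: the kind of dot products a convolution performs lose very little accuracy in single precision. The vector sizes and seed here are arbitrary choices for illustration.]

```python
import numpy as np

# Compare a convolution-like dot product in float64 vs float32.
rng = np.random.default_rng(0)
w = rng.random(1024)   # positive values keep the sum well away from zero
x = rng.random(1024)

y64 = np.dot(w, x)                                         # double precision reference
y32 = np.dot(w.astype(np.float32), x.astype(np.float32))   # single precision

rel_err = abs(float(y32) - y64) / abs(y64)
print(rel_err)   # on the order of 1e-7, far below what SGD training notices
```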
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
> as I want to buy a graphics card for CNN: do I need double precision performance?

Personally, I was thinking of experimenting with ints, bytes, and shorts, even less precise than singles :-)
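[A sketch of the byte-precision idea above, not from the thread: symmetric int8 quantization of a weight tensor with a single float scale. All names and the per-tensor scale scheme are illustrative assumptions, not anything the poster specified.]

```python
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 in [-127, 127] with one shared scale."""
    scale = float(np.max(np.abs(w))) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal(256).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Worst-case rounding error is about scale/2 per weight.
print(float(np.max(np.abs(w - w_hat))))
```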
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
You are going to be computing gradients of functions, and most people find it easier to think about these things using a type that roughly corresponds to the notion of a real number.

You can use a fixed-point representation of reals, which uses ints in the end, but then you have to worry about what scale to use, so that you get enough precision without running the risk of overflow. The only reason I might consider a fixed-point representation is to achieve reproducibility of results.

On Thu, Dec 25, 2014 at 5:44 AM, hughperkins2 hughperki...@gmail.com wrote:
> > as I want to buy a graphics card for CNN: do I need double precision performance?
> Personally, I was thinking of experimenting with ints, bytes, and shorts, even less precise than singles :-)
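[A minimal sketch of the scale/overflow trade-off described above, not from the thread: a Q16.16 fixed-point format (16 integer bits, 16 fractional bits) built on plain ints. The choice of 16 fractional bits is an arbitrary assumption; more fractional bits buy precision at the cost of overflow headroom, exactly the tension the post describes.]

```python
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS   # 65536: the fixed scale factor

def to_fix(x):
    """Convert a real number to fixed point."""
    return int(round(x * SCALE))

def to_real(f):
    """Convert fixed point back to a real number."""
    return f / SCALE

def fix_mul(a, b):
    """Multiply two fixed-point values; the product must be rescaled,
    and the intermediate a * b is where overflow bites on fixed-width ints."""
    return (a * b) >> FRAC_BITS

a, b = to_fix(1.5), to_fix(-2.25)
print(to_real(fix_mul(a, b)))   # -3.375, bit-identical on every machine
```

Reproducibility is the payoff: integer arithmetic gives the same bits regardless of hardware or summation order, which floating point does not guarantee.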
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
You can do some GPU experiments on Amazon AWS before you buy. 65 cents per hour.

David

http://aws.amazon.com/ec2/instance-types/

G2: This family includes G2 instances intended for graphics and general purpose GPU compute applications. Features: High Frequency Intel Xeon E5-2670 (Sandy Bridge) processors; high-performance NVIDIA GPU with 1,536 CUDA cores and 4 GB of video memory.

GPU Instances - Current Generation: g2.2xlarge, $0.650 per hour.

-----Original Message-----
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Detlef Schmicker
Sent: Thursday, December 25, 2014 2:00 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

> as I want to buy a graphics card for CNN: do I need double precision performance? I gave caffe (http://caffe.berkeleyvision.org/) a try, and as far as I understood, most of it is done in single precision.
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi Aja,

Couple of questions:

1. Connectivity, number of parameters

Just to check: each filter connects to all the feature maps below it, is that right? I tried to check that by ball-park estimating the number of parameters in that case, and comparing with the corresponding paragraph in your section 4. That seems to support the hypothesis, but my estimate under-estimates the number of parameters by about 20%:

Estimated total number of parameters ≈ 12 layers * 128 filters * 128 previous feature maps * 3 * 3 filter size = 1.8 million

But you say 2.3 million. It's similar, so it seems the feature maps are fully connected to the lower-level feature maps, but I'm not sure where the extra 500,000 parameters come from?

2. Symmetry

Aja, you say in section 5.1 that adding symmetry does not modify the accuracy, neither higher nor lower. Since adding symmetry presumably reduces the number of weights, and therefore increases learning speed, why did you decide not to implement symmetry?

Hugh
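[Hugh's back-of-envelope estimate above, written out as a calculation. It counts convolution weights only, with no bias terms, which turns out to matter.]

```python
# Weights-only estimate: 12 layers, 128 filters each, 128 input
# feature maps per filter, 3x3 kernels.
layers, filters, prev_maps, k = 12, 128, 128, 3
estimate = layers * filters * prev_maps * k * k
print(estimate)   # 1769472, i.e. roughly 1.8 million
```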
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
This is my guess as to what the number of parameters actually is:

First layer: 128 * (5*5*36 + 19*19) (128 filters of size 5x5 on 36 layers of input, position-dependent biases)

11 hidden layers: 11 * 128 * (3*3*128 + 19*19) (128 filters of size 3x3 on 128 layers of input, position-dependent biases)

Final layer: 2 * (3*3*128 + 19*19) (2 filters of size 3x3 on 128 layers of input, position-dependent biases)

Total number of parameters: 2,294,738

Did I get that right? I have the same question about the use of symmetry as Hugh.

Álvaro.

On Thu, Dec 25, 2014 at 8:49 PM, Hugh Perkins hughperk...@gmail.com wrote:
> Just to check: each filter connects to all the feature maps below it, is that right? [...] But you say 2.3 million. It's similar, so it seems the feature maps are fully connected to the lower-level feature maps, but I'm not sure where the extra 500,000 parameters come from?
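[Álvaro's breakdown above, written out. If this guess is right, the position-dependent biases (19*19 per filter) account for the roughly 500,000 parameters missing from a weights-only estimate.]

```python
board = 19 * 19                              # 361 position-dependent biases per filter
first  = 128 * (5 * 5 * 36 + board)          # 128 5x5 filters on 36 input planes
hidden = 11 * 128 * (3 * 3 * 128 + board)    # 11 layers of 128 3x3 filters on 128 maps
final  = 2 * (3 * 3 * 128 + board)           # 2 output filters on 128 maps

total = first + hidden + final
print(total)   # 2294738, matching the paper's ~2.3 million
```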