Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Detlef Schmicker
Hi,

as I want to buy a graphics card for CNNs: do I need double-precision
performance? I am giving Caffe (http://caffe.berkeleyvision.org/) a try,
and as far as I understand, most of it is done in single precision.

You can get comparable single-precision performance from NVIDIA (as
Caffe uses CUDA, I am looking at NVIDIA cards) for about $340, but the
double-precision performance is 10x lower than that of the $1000 cards.

thanks a lot

Detlef

On Wednesday, 2014-12-24, at 12:14 +0800, hughperkins2 wrote:
 Whilst it's technically true that you can use an NN with one hidden
 layer to learn the same function as a deeper net, you might need a
 combinatorially large number of nodes :-)

 Scaling Learning Algorithms towards AI, by Bengio and LeCun, 2007,
 makes a convincing case along these lines.
 
 
 

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Álvaro Begué
No, you don't need double precision at all.

Álvaro.



Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread hughperkins2
 as I want to buy a graphics card for CNNs: do I need double-precision
 performance?

Personally, I was thinking of experimenting with ints, bytes, and
shorts, even less precise than singles :-)
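
For concreteness, a toy numpy sketch of what 8-bit weights might look
like (the tensor shape and the single shared scale factor are my own
assumptions, purely for illustration):

import numpy as np

# Hypothetical example: quantize float32 weights to int8 with one
# shared scale factor, then dequantize to see the rounding error.
weights = np.random.randn(128, 128, 3, 3).astype(np.float32)

scale = np.abs(weights).max() / 127.0  # map the largest weight to +/-127
q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
restored = q.astype(np.float32) * scale

print("max abs error:", np.abs(weights - restored).max())  # ~ scale/2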

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Álvaro Begué
You are going to be computing gradients of functions, and most people
find it easier to think about these things using a type that roughly
corresponds to the notion of a real number. You can use a fixed-point
representation of reals, which uses ints in the end, but then you have
to worry about what scale to use, so that you get enough precision
without running the risk of overflowing.

The only reason I might consider a fixed-point representation is to achieve
reproducibility of results.
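
To make the scale/overflow point concrete, here is a toy fixed-point
sketch (the Q-format, bit widths, and helper names are arbitrary
choices of mine, not from any particular library):

import numpy as np

# Represent reals as int32 with 8 fractional bits (Q23.8).
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    return np.int32(np.round(x * SCALE))

def fixed_mul(a, b):
    # The product of two Q.8 numbers carries 16 fractional bits, so we
    # shift back by 8; widening to int64 first keeps the intermediate
    # product from overflowing.
    return np.int32((np.int64(a) * np.int64(b)) >> FRAC_BITS)

a, b = to_fixed(1.5), to_fixed(-2.25)
print(fixed_mul(a, b) / SCALE)  # -3.375

Pick the scale too small and you lose precision; too large and the
intermediate products overflow. That is exactly the trade-off above.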





Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread David Fotland
You can do some GPU experiments on Amazon AWS before you buy, at 65
cents per hour.

David

http://aws.amazon.com/ec2/instance-types/

G2
This family includes G2 instances intended for graphics and
general-purpose GPU compute applications.

Features:

High-frequency Intel Xeon E5-2670 (Sandy Bridge) processors
High-performance NVIDIA GPU with 1,536 CUDA cores and 4 GB of video memory

GPU Instances - Current Generation
g2.2xlarge  $0.650 per hour


Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Hugh Perkins
Hi Aja,

Couple of questions:

1. connectivity, number of parameters

Just to check: each filter connects to all the feature maps below it,
is that right?  I tried to check that by ball-park estimating the
number of parameters in that case and comparing it to the corresponding
paragraph in your section 4.  That seems to support the hypothesis, but
my estimate somehow under-estimates the number of parameters, by about
20%:

Estimated total number of parameters
approx = 12 layers * 128 filters * 128 previous feature maps
         * 3 * 3 filter size
       = 1.8 million

But you say 2.3 million.  It's similar, so it seems the feature maps
are fully connected to the lower-level feature maps, but I'm not sure
where the extra 500,000 parameters come from.
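
As a sanity check, the arithmetic above in plain Python (note this
counts only the 3x3 convolution weights, with no bias terms, and
treats the first layer like all the others):

layers, filters, maps, k = 12, 128, 128, 3
print(layers * filters * maps * k * k)  # 1769472, i.e. about 1.8 million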

2. Symmetry

Aja, you say in section 5.1 that adding symmetry does not modify the
accuracy, neither higher nor lower.  Since adding symmetry presumably
reduces the number of weights, and therefore increases learning speed,
why did you decide not to implement symmetry?

Hugh

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Álvaro Begué
This is my guess as to what the number of parameters actually is:
First layer: 128 * (5*5*36 + 19*19)
  (128 filters of size 5x5 on 36 layers of input, position-dependent biases)
11 hidden layers: 11 * 128 * (3*3*128 + 19*19)
  (128 filters of size 3x3 on 128 layers of input, position-dependent biases)
Final layer: 2 * (3*3*128 + 19*19)
  (2 filters of size 3x3 on 128 layers of input, position-dependent biases)

Total number of parameters: 2294738
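
Spelled out in plain Python, that breakdown gives:

first  = 128 * (5 * 5 * 36 + 19 * 19)        # 161408
hidden = 11 * 128 * (3 * 3 * 128 + 19 * 19)  # 2130304
final  = 2 * (3 * 3 * 128 + 19 * 19)         # 3026
print(first + hidden + final)                # 2294738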

Did I get that right?

I have the same question about the use of symmetry as Hugh.

Álvaro.

