Let's be pragmatic - humans heavily use the information about the last
move too. If they take a while, they don't need to know the last move
of the opponent when reviewing a position, but when reading a tactical
sequence, the previous move in the sequence is an essential piece of
information.
Hi Hugh,
On Fri, Dec 26, 2014 at 9:49 AM, Hugh Perkins hughperk...@gmail.com wrote:
Estimated total number of parameters
approx = 12 layers * 128 filters * 128 previous featuremaps * 3 * 3
filtersize
= 1.8 million
But you say 2.3 million. It's similar, so it seems the feature maps are
fully connected.
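The back-of-envelope arithmetic above can be reproduced in a couple of lines (the layer count, filter count, and filter size are taken from the message; biases and the first layer's larger filters are ignored, which is part of why it comes in under the paper's figure):

```python
# Rough parameter count: 12 convolutional layers, each with 128 filters
# of size 3x3 over 128 incoming feature maps. Biases are ignored.
layers, filters, prev_maps, ksize = 12, 128, 128, 3

weights = layers * filters * prev_maps * ksize * ksize
print(weights)  # 1769472, i.e. roughly 1.8 million
```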
Hi,
as I want to buy a graphics card for CNN: do I need double precision
performance? I am giving Caffe (http://caffe.berkeleyvision.org/) a try, and
as far as I understood most is done in single precision?!
You get comparable single precision performance on NVIDIA (as Caffe uses
CUDA, I am looking at NVIDIA) for
No, you don't need double precision at all.
Álvaro.
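As a quick illustration of why single precision suffices (a generic NumPy check, not tied to Caffe): float32 keeps about 7 significant digits, and the rounding error of a large reduction stays far below the noise already present in minibatch gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1000)
w = rng.random(1000)

# The same dot product in double and in single precision.
exact = float(np.dot(x, w))                                      # float64
approx = float(np.dot(x.astype(np.float32), w.astype(np.float32)))

rel_err = abs(exact - approx) / abs(exact)
print(rel_err)  # tiny -- orders of magnitude below minibatch noise
```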
On Thu, Dec 25, 2014 at 5:00 AM, Detlef Schmicker d...@physik.de wrote:
Hi,
as I want to buy a graphics card for CNN: do I need double precision
performance? I am giving Caffe (http://caffe.berkeleyvision.org/) a try, and
as far as I understood
Personally, I was thinking of experimenting with ints, bytes, and shorts,
even less precise than singles :-)
You are going to be computing gradients of functions, and most people find
it easier to think about these things using a type that roughly corresponds
to the notion of real number. You can use a fixed-point representation of
reals, which uses ints in the end, but then you have to worry about what
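To make the fixed-point worry concrete, here is a toy Q16.16 sketch (a generic illustration, not from any Go program): every multiply produces extra fractional bits, and you must decide where to shift and round, bookkeeping that floating point does for you.

```python
SCALE = 1 << 16  # Q16.16 fixed point: stored int = value * 2**16

def to_fixed(x: float) -> int:
    return int(round(x * SCALE))

def to_float(a: int) -> float:
    return a / SCALE

def fixed_mul(a: int, b: int) -> int:
    # The raw product carries 32 fractional bits; shifting back to 16
    # silently truncates -- exactly the detail you have to worry about.
    return (a * b) >> 16

lr, grad = to_fixed(0.01), to_fixed(0.5)
step = fixed_mul(lr, grad)
print(to_float(step))  # close to 0.01 * 0.5 = 0.005, minus truncation loss
```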
Sent: December 25, 2014 2:00 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional
Neural Networks
Hi,
as I want to buy a graphics card for CNN: do I need double precision
performance? I am giving Caffe (http://caffe.berkeleyvision.org/) a try
Hi Aja,
Couple of questions:
1. connectivity, number of parameters
Just to check, each filter connects to all the feature maps below it,
is that right? I tried to check that by ball-park estimating the number
of parameters in that case, and comparing to the corresponding paragraph in
your section 4.
This is my guess as to what the number of parameters actually is:
First layer: 128 * (5*5*36 + 19*19) (128 filters of size 5x5 on 36 layers
of input, position-dependent biases)
11 hidden layers: 11 * 128 * (3*3*128 + 19*19) (128 filters of size 3x3 on
128 layers of input, position-dependent biases)
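Summing those guesses (Hugh's reading of the architecture, not confirmed numbers) does land close to the paper's 2.3 million:

```python
# Hugh's guessed shapes: 36 input planes, one 5x5 layer of 128 filters,
# then 11 hidden 3x3 layers of 128 filters, each layer also carrying a
# position-dependent bias for every point of the 19x19 board.
board = 19 * 19  # 361 biases per layer

first = 128 * (5 * 5 * 36 + board)         # first layer
hidden = 11 * 128 * (3 * 3 * 128 + board)  # 11 hidden layers
total = first + hidden
print(total)  # 2291712 -- about 2.3 million parameters
```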
Hi Aja,
Thanks for the game and report.
I saw the sgf; the CNN can play a ko fight. Great.
our best CNN is about 220 to 310 Elo stronger which is consistent
A deeper network and richer info gives +300 Elo? Impressive.
Aja, if your CNN+MCTS used Erica's playouts, how strong would it be?
I think it will be
I thought that any layers beyond 3 were irrelevant. Probably I'm subsuming
your NN into what I learned about NNs and didn't read anything carefully
enough.
Can you help correct me?
s.
On Dec 23, 2014 6:47 AM, Aja Huang ajahu...@google.com wrote:
On Mon, Dec 22, 2014 at 12:38 PM, David Silver
the network to devote its capacity to advanced concepts
rather than synthesizing basic concepts.
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
uurtamo .
Sent: Tuesday, December 23, 2014 7:34 PM
To: computer-go
Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional
Neural Networks
Whilst it's technically true that you can use an NN with one hidden layer
to learn the same function as a deeper net, you might need a combinatorially
large number of nodes :-)
"Scaling learning algorithms towards AI", by Bengio and LeCun, 2007, makes a
convincing case along these lines.
Hi Martin
- Would you be willing to share some of the sgf game records played by your
network with the community? I tried to replay the game record in your
paper, but got stuck since it does not show any of the moves that got
captured.
Sorry about that, we will correct the figure and repost.
Last move info is a strange beast, isn't it? I mean, except for ko
captures, it doesn't really add information to the position. The correct
prediction rate is such an obvious metric, but maybe prediction shouldn't
be improved at any price. To a certain degree, last move info is a kind of
Last move info is a cheap hint toward an unstable area (unless it is a
defense move).
Thomas
Aja and co-authors,
first of all, congratulations on outstanding results!
I have some questions:
- Would you be willing to share some of the sgf game records played by your
network with the community? I tried to replay the game record in your paper,
but got stuck since it does not show any
On Fri Dec 19 23:17:23 UTC 2014, Aja Huang wrote:
We've just submitted our paper to ICLR. We made the draft available at
http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf
Cool... just out of curiosity, I did a back-of-an-envelope estimation of
the cost of training your and Clark and Storkey's networks.
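Such an estimate might be structured like this; every input below is an illustrative placeholder, not a number from either paper, and the FLOP rule of thumb (2 FLOPs per weight per output position forward, roughly double again for the backward pass) is only a crude convention:

```python
# Crude training-cost estimate for a convnet move predictor.
positions = 27_000_000   # assumed training set size (placeholder)
epochs = 10              # assumed passes over the data (placeholder)
params = 2_300_000       # shared conv weights, ~2.3M as discussed
board_points = 19 * 19   # each conv weight is reused at every point

# forward ~2 FLOPs per weight per position; backward ~2x the forward
flops_per_example = 3 * 2 * params * board_points
total_flops = positions * epochs * flops_per_example
print(f"{total_flops:.1e} FLOPs")  # order-of-magnitude only
```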
Great work. Looks like the age of nn is here.
How does this compare in computation time to a heavy MC move generator?
One very minor quibble, I feel like a nag for even mentioning it: You write
The most frequently cited reason for the difficulty of Go, compared to
games such as Chess, Scrabble
On 20.12.2014 09:43, Stefan Kaitschick wrote:
If MC has shown anything, it's that computationally it's much easier to
suggest a good move than to evaluate the position.
Such can only mean an improper understanding of positional judgement.
Positional judgement depends on reading (or MC
On 20.12.2014 11:21, Detlef Schmicker wrote:
it is not easy to get training data sets for an evaluation function?!
You seem to be asking for abundant data sets, e.g., with triples (Position,
Territory, Influence). Indeed, only dozens are available in the
literature, and they need a bit of extra work.
Hi,
I am still fighting with the NN slang, but why do you zero-pad the
output (page 3, section 4: Architecture & Training)?
From all I read up to now, most are zero-padding the input to make the
output fit 19x19?!
Thanks for the great work
Detlef
On Friday, 19.12.2014 at 23:17 +0000, Aja wrote:
If you start with a 19x19 grid and you take convolutional filters of size
5x5 (as an example), you'll end up with a board of size 15x15, because a
5x5 box can be placed inside a 19x19 board in 15x15 different locations. We
can get 19x19 outputs if we allow the 5x5 box to be centered on any point,
which requires padding the input with zeros.
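That arithmetic is easy to verify with a plain NumPy sketch (independent of any particular framework):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

board = np.zeros((19, 19))
k = 5  # filter size

# "Valid" placement: a 5x5 window fits inside 19x19 in 15x15 positions.
valid = sliding_window_view(board, (k, k))
print(valid.shape[:2])  # (15, 15)

# Zero-padding by (k - 1) // 2 = 2 on every side lets the window be
# centered on each of the 19x19 points, restoring a 19x19 output.
same = sliding_window_view(np.pad(board, (k - 1) // 2), (k, k))
print(same.shape[:2])  # (19, 19)
```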
Thanks for sharing. I'm intrigued by your strategy for integrating
with MCTS. It's clear that latency is a challenge for integration. Do
you have any statistics on how many searches new nodes had been
through by the time the predictor comes back with an estimation? Did
you try any prefetching
Many Faces
adjusts the biases.
David
-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf
Of Mark Wagner
Sent: Saturday, December 20, 2014 11:18 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional
Neural Networks
I think many of the programs have a mechanism for dealing with “slow”
knowledge. For example in Fuego, you can call a knowledge function for each
node that reaches some threshold T of playouts. The new technical challenge is
dealing with the GPU. I know nothing about it myself, but from what I
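The threshold mechanism described above can be sketched as follows; the names are hypothetical and this mirrors the general idea, not Fuego's actual API, and a real GPU setup would batch the expensive calls rather than make them synchronously:

```python
# Sketch: attach expensive "slow" knowledge (e.g. CNN move priors) to a
# search node only once it has accumulated T playouts.
T = 200  # playout threshold before paying for slow knowledge

class Node:
    def __init__(self):
        self.playouts = 0
        self.priors = None  # filled lazily by the slow evaluator

    def on_playout(self, slow_evaluator):
        self.playouts += 1
        if self.playouts == T and self.priors is None:
            self.priors = slow_evaluator(self)

def dummy_cnn(node):
    # Stand-in for a real network returning move probabilities.
    return {"D4": 0.3, "Q16": 0.25}

node = Node()
for _ in range(T):
    node.on_playout(dummy_cnn)
print(node.priors is not None)  # True: priors arrived at the T-th playout
```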
Aja wrote:
We haven't measured that but I think move history is an important feature
since Go is very much about answering the opponent's last move locally
(that's also why in Go we have the term tenuki for not answering the last
move).
I guess you could get some measure of the importance
Hi Mark,
2014-12-20 19:17 GMT+00:00 Mark Wagner wagner.mar...@gmail.com:
Thanks for sharing. I'm intrigued by your strategy for integrating
Hi Aja
We've just submitted our paper to ICLR. We made the draft available at
http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf
I hope you enjoy our work. Comments and questions are welcome.
I did not look at the Go content, on which I'm no expert.
But for the network training, you might be
On Sat, Dec 20, 2014 at 12:17 AM, Aja Huang ajahu...@google.com wrote:
We've just submitted our paper to ICLR. We made the draft available at
http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf
Hi Aja,
Wow, very impressive. In fact so impressive, it seems a bit
suspicious(*)... If this is real