Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2015-01-10 Thread Stefan Kaitschick

 Let's be pragmatic - humans heavily use the information about the last
 move too.  If they take a while, they don't need to know the last move
 of the opponent when reviewing a position, but when reading a tactical
 sequence the previous move in the sequence is an essential piece of
 information.

 --
 Petr Baudis


So maybe there is a greater justification for using last move info in the
playout than in the tree?

Stefan
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-26 Thread Aja Huang
Hi Hugh,

On Fri, Dec 26, 2014 at 9:49 AM, Hugh Perkins hughperk...@gmail.com wrote:

 Estimated total number of parameters
 approx = 12 layers * 128 filters * 128 previous featuremaps * 3 * 3
 filtersize
 = 1.8 million

 But you say 2.3 million.  It's similar, so it seems the feature maps are
 fully connected to the lower-level feature maps, but I'm not sure where
 the extra 500,000 parameters come from?


You may have forgotten to include the position-dependent biases. This is
how I computed the number of parameters:

1st layer + 11*middle layers + final layer + 12*middle layer bias + output
bias

5*5*36*128 + 3*3*128*128*11 + 3*3*128*2 + 128*19*19*12 + 2*19*19 = 2,294,738
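
For anyone checking the arithmetic, the same breakdown as a few lines of
Python, using the layer sizes quoted above:

    # Parameter count for the 12-layer CNN, following the breakdown above
    planes, board = 36, 19                     # input feature planes, board size
    first_layer   = 5 * 5 * planes * 128       # 5x5 filters, 128 feature maps
    middle_layers = 3 * 3 * 128 * 128 * 11     # 11 middle layers of 3x3 filters
    final_layer   = 3 * 3 * 128 * 2            # 2 output feature maps
    layer_biases  = 128 * board * board * 12   # position-dependent biases, 12 layers
    output_biases = 2 * board * board          # position-dependent output biases
    total = (first_layer + middle_layers + final_layer
             + layer_biases + output_biases)
    print(total)                               # 2294738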


 2. Symmetry

 Aja, you say in section 5.1 that adding symmetry does not modify the
 accuracy, neither higher nor lower.  Since adding symmetry presumably
 reduces the number of weights, and therefore increases learning speed,
 why did you decide not to implement symmetry?


We were doing exploratory work that optimized performance, not training
time, so we don't know how symmetry affects training time. In terms of
performance it seems not to have an effect.
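
For anyone who wants to experiment, here is a minimal numpy sketch of one
common way to add symmetry at evaluation time, namely averaging the network's
output over the eight board symmetries (this illustrates the general
technique, not necessarily the exact scheme tested in the paper):

    import numpy as np

    def dihedral_average(predict, board):
        # predict: function from a 19x19 input array to a 19x19 output map
        total = np.zeros((19, 19))
        for k in range(4):                 # the four rotations...
            for flip in (False, True):     # ...each with and without reflection
                b = np.rot90(board, k)
                if flip:
                    b = np.fliplr(b)
                p = predict(b)
                if flip:                   # undo the transform on the output
                    p = np.fliplr(p)
                total += np.rot90(p, -k)
        return total / 8.0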

Aja
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Detlef Schmicker
Hi,

as I want to buy a graphics card for CNN: do I need double precision
performance? I'm giving caffe (http://caffe.berkeleyvision.org/) a try, and
as far as I understand, most of it is done in single precision?!

You can get comparable single-precision performance from NVIDIA (as caffe
uses CUDA, I'm looking at NVIDIA) for about $340, but the double-precision
performance is 10x lower than on the $1000 cards

thanks a lot

Detlef

On Wednesday, 24.12.2014, 12:14 +0800, hughperkins2 wrote:
 Whilst it's technically true that you can use an nn with one hidden
 layer to learn the same function as a deeper net, you might need a
 combinatorially large number of nodes :-)
 
 
 scaling learning algorithms towards ai, by bengio and lecun, 2007,
 makes a convincing case along these lines. 
 
 
 
 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go


___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Álvaro Begué
No, you don't need double precision at all.

Álvaro.


On Thu, Dec 25, 2014 at 5:00 AM, Detlef Schmicker d...@physik.de wrote:

 Hi,

 as I want to buy a graphics card for CNN: do I need double precision
 performance? I'm giving caffe (http://caffe.berkeleyvision.org/) a try, and
 as far as I understand, most of it is done in single precision?!

 You can get comparable single-precision performance from NVIDIA (as caffe
 uses CUDA, I'm looking at NVIDIA) for about $340, but the double-precision
 performance is 10x lower than on the $1000 cards

 thanks a lot

 Detlef

 On Wednesday, 24.12.2014, 12:14 +0800, hughperkins2 wrote:
  Whilst it's technically true that you can use an nn with one hidden
  layer to learn the same function as a deeper net, you might need a
  combinatorially large number of nodes :-)
 
 
  scaling learning algorithms towards ai, by bengio and lecun, 2007,
  makes a convincing case along these lines.
 
 
 
  ___
  Computer-go mailing list
  Computer-go@computer-go.org
  http://computer-go.org/mailman/listinfo/computer-go


 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread hughperkins2
 as I want to buy a graphics card for CNN: do I need double precision
 performance?

Personally, I was thinking of experimenting with ints, bytes, and shorts, even
less precise than singles :-)
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Álvaro Begué
You are going to be computing gradients of functions, and most people find
it easier to think about these things using a type that roughly corresponds
to the notion of real number. You can use a fixed-point representation of
reals, which uses ints in the end, but then you have to worry about what
scale to use, so you get enough precision but you don't run the risk of
overflowing.

The only reason I might consider a fixed-point representation is to achieve
reproducibility of results.
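
To make that bookkeeping concrete, a toy fixed-point sketch in Python (the
Q16.16 scale factor is an arbitrary choice for the example):

    SCALE_BITS = 16
    SCALE = 1 << SCALE_BITS          # Q16.16: 16 fractional bits

    def to_fixed(x):                 # real -> scaled integer
        return int(round(x * SCALE))

    def fixed_mul(a, b):             # rescale after multiplying two fixed ints
        return (a * b) >> SCALE_BITS

    def to_real(a):                  # scaled integer -> real
        return a / SCALE

    a, b = to_fixed(0.75), to_fixed(-1.5)
    print(to_real(fixed_mul(a, b)))  # -1.125, bit-identical on every machine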




On Thu, Dec 25, 2014 at 5:44 AM, hughperkins2 hughperki...@gmail.com
wrote:

  as I want to buy a graphics card for CNN: do I need double precision
 performance?

 Personally, I was thinking of experimenting with ints, bytes, and shorts,
 even less precise than singles :-)

 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread David Fotland
You can do some GPU experiments on Amazon AWS before you buy.  65 cents per hour

David

http://aws.amazon.com/ec2/instance-types/

G2
This family includes G2 instances intended for graphics and general purpose GPU 
compute applications. 
Features:

High Frequency Intel Xeon E5-2670 (Sandy Bridge) Processors
High-performance NVIDIA GPU with 1,536 CUDA cores and 4GB of video memory

GPU Instances - Current Generation
g2.2xlarge  $0.650 per Hour

 -Original Message-
 From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf
 Of Detlef Schmicker
 Sent: Thursday, December 25, 2014 2:00 AM
 To: computer-go@computer-go.org
 Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional
 Neural Networks
 
 Hi,
 
  as I want to buy a graphics card for CNN: do I need double precision
  performance? I'm giving caffe (http://caffe.berkeleyvision.org/) a try, and as
  far as I understand, most of it is done in single precision?!
  
  You can get comparable single-precision performance from NVIDIA (as caffe uses
  CUDA, I'm looking at NVIDIA) for about $340, but the double-precision
  performance is 10x lower than on the $1000 cards
 
 thanks a lot
 
 Detlef
 
  On Wednesday, 24.12.2014, 12:14 +0800, hughperkins2 wrote:
  Whilst it's technically true that you can use an nn with one hidden
  layer to learn the same function as a deeper net, you might need a
  combinatorially large number of nodes :-)
 
 
  scaling learning algorithms towards ai, by bengio and lecun, 2007,
  makes a convincing case along these lines.
 
 
 
  ___
  Computer-go mailing list
  Computer-go@computer-go.org
  http://computer-go.org/mailman/listinfo/computer-go
 
 
 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Hugh Perkins
Hi Aja,

Couple of questions:

1. connectivity, number of parameters

Just to check: each filter connects to all the feature maps below it,
is that right?  I tried to check that by ball-park estimating the number
of parameters in that case, and comparing it to the corresponding paragraph
in your section 4.  That seems to support the hypothesis.  But
my estimate somehow under-estimates the number of
parameters, by about 20%:

Estimated total number of parameters
approx = 12 layers * 128 filters * 128 previous featuremaps * 3 * 3 filtersize
= 1.8 million

But you say 2.3 million.  It's similar, so it seems the feature maps are
fully connected to the lower-level feature maps, but I'm not sure where
the extra 500,000 parameters come from?

2. Symmetry

Aja, you say in section 5.1 that adding symmetry does not modify the
accuracy, neither higher nor lower.  Since adding symmetry presumably
reduces the number of weights, and therefore increases learning speed,
why did you decide not to implement symmetry?

Hugh
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-25 Thread Álvaro Begué
This is my guess as to what the number of parameters actually is:
First layer: 128 * (5*5*36 + 19*19) (128 filters of size 5x5 on 36 layers
of input, position-dependent biases)
11 hidden layers: 11 * 128 * (3*3*128 + 19*19) (128 filters of size 3x3 on
128 layers of input, position-dependent biases)
Final layer: 2 * (3*3*128 + 19*19) (2 filters of size 3x3 on 128 layers of
input, position-dependent biases)

Total number of parameters: 2,294,738

Did I get that right?

I have the same question about the use of symmetry as Hugh.

Álvaro.


On Thu, Dec 25, 2014 at 8:49 PM, Hugh Perkins hughperk...@gmail.com wrote:

 Hi Aja,

 Couple of questions:

 1. connectivity, number of parameters

 Just to check: each filter connects to all the feature maps below it,
 is that right?  I tried to check that by ball-park estimating the number
 of parameters in that case, and comparing it to the corresponding paragraph
 in your section 4.  That seems to support the hypothesis.  But
 my estimate somehow under-estimates the number of
 parameters, by about 20%:

 Estimated total number of parameters
 approx = 12 layers * 128 filters * 128 previous featuremaps * 3 * 3
 filtersize
 = 1.8 million

 But you say 2.3 million.  It's similar, so it seems the feature maps are
 fully connected to the lower-level feature maps, but I'm not sure where
 the extra 500,000 parameters come from?

 2. Symmetry

 Aja, you say in section 5.1 that adding symmetry does not modify the
 accuracy, neither higher nor lower.  Since adding symmetry presumably
 reduces the number of weights, and therefore increases learning speed,
 why did you decide not to implement symmetry?

 Hugh
 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-23 Thread Hiroshi Yamashita

Hi Aja,

Thanks for a game and report.
I saw sgf, CNN can play ko fight. great.


our best CNN is about 220 to 310 Elo stronger which is consistent


A deeper network and richer info make +300 Elo? Impressive.
Aja, if your CNN+MCTS used Erica's playouts, how strong would it be?
I think it would be a contender for strongest program.

I also wonder if Fuego could release the latest version as 1.2, and use
odd-numbered 1.3.x for development.

Regards,
Hiroshi Yamashita

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-23 Thread Hideki Kato
Hiroshi Yamashita: 37E4294EAD9142EA84D1031F3E1E9C7C@x60:
Hi Aja,

Thanks for a game and report.
I saw sgf, CNN can play ko fight. great.

 our best CNN is about 220 to 310 Elo stronger which is consistent

Deeper network and rich info makes +300 Elo? impressive.
Aja, if your CNN+MCTS use Erica's playout, how strong will it be?
I think it will be contender for strongest program.

The playing strength of an MCTS program is dominated by the
correctness of the simulations, especially of LD (life and death).  Prior
knowledge helps a little.  David pointed out after the first Densei-sen
(almost three years ago):
All mcts programs have trouble with the positions near the end.  The group
in the center has miai for two eyes.  Same for the group at the top.  The
upper left side group has one big eye shape.  For all three groups the
playouts sometimes kill them.  The black stones are pretty solid, so the
playouts let them survive.  So even at the end, zen has 50% win rate, MFGO
has 60%, and pachi has 70% win rate for black.

Without improving the correctness of the simulations, MCTS programs 
can't go up to next stage.

Hideki
-- 
Hideki Kato mailto:hideki_ka...@ybb.ne.jp
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-23 Thread uurtamo .
I thought that any layers beyond 3 were irrelevant. Probably I'm subsuming
your nn into what I learned about nn's and didn't read anything carefully
enough.

Can you help correct me?

s.
On Dec 23, 2014 6:47 AM, Aja Huang ajahu...@google.com wrote:

 On Mon, Dec 22, 2014 at 12:38 PM, David Silver davidstarsil...@gmail.com
 wrote:

 we'll evaluate against Fuego 1.1 and post the results.


 I quickly tested our 12-layer CNN against Fuego 1.1 with 5 secs and 10
 secs per move, 2 threads. The hardware is Intel(R) Xeon(R) CPU E5-2687W 0
 @ 3.10GHz.

 5 secs per move:  12-layer CNN scored 55.8% ±5.4
 10 secs per move: 12-layer CNN scored 32.9% ±3.8

 Fuego 1.1 is clearly much weaker than the latest svn release. And
 interestingly, the network is actually as strong as Fuego 1.1 with 5 secs
 per move.

 Since Clark and Storkey's CNN scored 12% against Fuego 1.1 running on
 weaker hardware, our best CNN is about 220 to 310 Elo stronger, which is
 consistent with the results against GnuGo.
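
The usual logistic Elo model maps a win rate against a fixed opponent to an
Elo difference; a small Python sketch of that standard formula (this is the
textbook conversion, not necessarily the exact calculation behind the numbers
above):

    import math

    def elo_diff(win_rate):
        # Elo difference implied by a win rate, under the logistic Elo model
        return 400.0 * math.log10(win_rate / (1.0 - win_rate))

    print(elo_diff(0.558))  # about  +40 Elo (55.8% at 5 secs per move)
    print(elo_diff(0.120))  # about -346 Elo (Clark and Storkey's 12%)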

 Regards,
 Aja


 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-23 Thread Brian Sheppard
A 3-layer network (input, hidden, output) is sufficient to be a universal 
function approximator, so from a theoretical perspective only 3 layers are 
necessary. But the gap between theoretical and practical is quite large.

 

The CNN architecture builds in translation invariance and sensitivity to local  
phenomena. That gives it a big advantage (on a per distinct weight basis) over 
the flat architecture.

 

Additionally, the input layers of these CNN designs are very important. 
Compared to a stone-by-stone representation, the use of high level concepts in 
the input layer allows the network to devote its capacity to advanced concepts 
rather than synthesizing basic concepts.
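
To make the per-distinct-weight comparison concrete, a tiny Python sketch
(sizes chosen purely for illustration):

    board = 19 * 19                     # 361 points
    dense_weights = board * board       # one flat 361 -> 361 layer
    conv_weights = 3 * 3                # one 3x3 filter, shared across positions
    print(dense_weights, conv_weights)  # 130321 versus 9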

 

From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of 
uurtamo .
Sent: Tuesday, December 23, 2014 7:34 PM
To: computer-go
Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional 
Neural Networks

 

I thought that any layers beyond 3 were irrelevant. Probably I'm subsuming your 
nn into what I learned about nn's and didn't read anything carefully enough.

Can you help correct me?

s.

On Dec 23, 2014 6:47 AM, Aja Huang ajahu...@google.com wrote:

On Mon, Dec 22, 2014 at 12:38 PM, David Silver davidstarsil...@gmail.com wrote:

we'll evaluate against Fuego 1.1 and post the results. 

 

I quickly tested our 12-layer CNN against Fuego 1.1 with 5 secs and 10 secs per 
move, 2 threads. The hardware is Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz.

 

5 secs per move:  12-layer CNN scored 55.8% ±5.4

10 secs per move: 12-layer CNN scored 32.9% ±3.8

 

Fuego 1.1 is clearly much weaker than the latest svn release. And interestingly,
the network is actually as strong as Fuego 1.1 with 5 secs per move.

 

Since Clark and Storkey's CNN scored 12% against Fuego 1.1 running on weaker
hardware, our best CNN is about 220 to 310 Elo stronger, which is consistent with
the results against GnuGo.

 

Regards,

Aja

 


___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-23 Thread hughperkins2
Whilst it's technically true that you can use an nn with one hidden layer to
learn the same function as a deeper net, you might need a combinatorially large
number of nodes :-)

scaling learning algorithms towards ai, by bengio and lecun, 2007, makes a
convincing case along these lines.

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-22 Thread David Silver
Hi Martin

- Would you be willing to share some of the sgf game records played by your
 network with the community? I tried to replay the game record in your
 paper, but got stuck since it does not show any of the moves that got
 captured.


Sorry about that, we will correct the figure and repost. In the meanwhile
Aja will post the .sgf for that game. Also, thanks for noticing that we
tested against a stronger version of Fuego than Clark and Storkey, we'll
evaluate against Fuego 1.1 and post the results. Unfortunately, we only
have approval to release the material in the paper, so we can't really give
any further data :-(

One more thing, Aja said he was tired when he measured his own performance
on KGS predictions (working too hard on this paper!) So it would be great
to get better statistics on how humans really do at predicting the next
move. Does anyone want to measure their own performance, say on 200
randomly chosen positions from the KGS data?

- Do you know how large the effect is from using the extra features that
 are not in the paper by Clark and Storkey, i.e. the last move info and the
 extra tactics? As a related question, would you get an OK result if you
 just zeroed out some inputs in the existing net, or would you need to
 re-train a new network from fewer inputs.


We trained our networks before we knew about Clark and Storkey's results,
so we haven't had a chance to evaluate the differences between the
approaches. But it's well known that last move info makes a big difference
to predictive performance, so I'd guess they would already be close to 50%
predictive accuracy if they included those features.


 - Is there a way to simplify the final network so that it is faster to
 compute and/or easier to understand? Is there something computed, maybe on
 an intermediate layer, that would be usable as a new feature in itself?


This is an interesting idea, but so far we only focused on building a large
and deep enough network to represent Go knowledge at all.

Cheers
Dave
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-22 Thread Stefan Kaitschick
Last move info is a strange beast, isn't it? I mean, except for ko
captures, it doesn't really add information to the position. The correct
prediction rate is such an obvious metric, but maybe prediction shouldn't
be improved at any price. To a certain degree, last move info is a kind of
self-delusion. A predictor that does well without it should be a lot more
robust, even if the percentages are poorer.

Stefan
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-22 Thread Thomas Wolf

Last move info is a cheap hint for an unstable area (unless it is a defense
move).

Thomas

On Mon, 22 Dec 2014, Stefan Kaitschick wrote:


Last move info is a strange beast, isn't it? I mean, except for ko captures,
it doesn't really add information to the position. The correct prediction
rate is such an obvious metric, but maybe prediction shouldn't be improved
at any price. To a certain degree, last move info is a kind of self-delusion.
A predictor that does well without it should be a lot more robust, even if
the percentages are poorer.

Stefan




___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-22 Thread Petr Baudis
On Mon, Dec 22, 2014 at 03:45:47PM +0100, Stefan Kaitschick wrote:
 Last move info is a strange beast, isn't it? I mean, except for ko
 captures, it doesn't really add information to the position. The correct
 prediction rate is such an obvious metric, but maybe prediction shouldn't
 be improved at any price. To a certain degree, last move info is a kind of
 self-delusion. A predictor that does well without it should be a lot more
 robust, even if the percentages are poorer.

Let's be pragmatic - humans heavily use the information about the last
move too.  If they take a while, they don't need to know the last move
of the opponent when reviewing a position, but when reading a tactical
sequence the previous move in the sequence is an essential piece of
information.

-- 
Petr Baudis
If you do not work on an important problem, it's unlikely
you'll do important work.  -- R. Hamming
http://www.cs.virginia.edu/~robins/YouAndYourResearch.html
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-21 Thread Martin Mueller
Aja and co-authors,

first of all, congratulations on outstanding results!

I have some questions: 

- Would you be willing to share some of the sgf game records played by your 
network with the community? I tried to replay the game record in your paper, 
but got stuck since it does not show any of the moves that got captured.

- Do you know how large the effect is from using the extra features that are
not in the paper by Clark and Storkey, i.e. the last move info and the extra 
tactics? As a related question, would you get an OK result if you just zeroed 
out some inputs in the existing net, or would you need to re-train a new 
network from fewer inputs.

- Is there a way to simplify the final network so that it is faster to compute 
and/or easier to understand? Is there something computed, maybe on an 
intermediate layer, that would be usable as a new feature in itself?

Thanks

Martin
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

[Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Hugh Perkins
On Fri Dec 19 23:17:23 UTC 2014, Aja Huang wrote:
 We've just submitted our paper to ICLR. We made the draft available at
 http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf

Cool... just out of curiosity, I did a back-of-an-envelope estimate of the
cost of training your network and Clark and Storkey's, if renting time
on AWS GPU instances, and came up with:
  - Clark and Storkey: 125 usd (4 days * 2 instances * 0.65 usd/hour)
  - Yours: 2025 usd (cost of Clark and Storkey * 25/7 epochs *
    29.4/14.7 action-pairs * 12/8 layers)

Probably a bit high for me personally to spend on just a weekend of fun,
but not outrageous at all if the same technique were being used by
an organization.
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Stefan Kaitschick
Great work. Looks like the age of nn is here.
How does this compare in computation time to a heavy MC move generator?

One very minor quibble, I feel like a nag for even mentioning it: you write
"The most frequently cited reason for the difficulty of Go, compared to
games such as Chess, Scrabble or Shogi, is the difficulty of constructing
an evaluation function that can differentiate good moves from bad in a
given position."

If MC has shown anything, it's that, computationally, it's much easier to
suggest a good move than to evaluate the position.
This is still true with your paper; it's just that the move suggestion has
become even better.

Stefan
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Robert Jasiek

On 20.12.2014 09:43, Stefan Kaitschick wrote:

If MC has shown anything, it's that, computationally, it's much easier to
suggest a good move than to evaluate the position.


Such can only mean an improper understanding of positional judgement. 
Positional judgement depends on reading (or MC simulation of reading) 
but the reading has a much smaller computational complexity because 
localisation and quiescence apply.


The major aspects of positional judgement are territory and influence. 
Evaluating influence is much easier than evaluating territory if one 
uses a partial influence concept: influence stone difference. Its major 
difficulty is the knowledge of which stones are alive or not, however, 
MC simulations applied to outside stones should be able to assess such 
with reasonable certainty fairly quickly. Hence, the major work of 
positional judgement is assessment of territory. See my book Positional 
Judgement 1 - Territory for that. By designing (heuristically or using a 
low level expert system) MC for its methods, territorial positional 
judgement by MC should be much faster than ordinary MC because much 
fewer simulations should do. However, it is not as elegant as ordinary 
MC because some expert knowledge is necessary or must be approximated 
heuristically. Needless to say, keep the computational complexity of 
this expert knowledge low.


--
robert jasiek
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Detlef Schmicker
On Saturday, 20.12.2014, 09:43 +0100, Stefan Kaitschick wrote:
 Great work. Looks like the age of nn is here.
 
 How does this compare in computation time to a heavy MC move
 generator?
 
 
 One very minor quibble, I feel like a nag for even mentioning it:  You
 write
 The most frequently cited reason for the difficulty of Go, compared
 to games such as Chess, Scrabble
 or Shogi, is the difficulty of constructing an evaluation function
 that can differentiate good moves
 from bad in a given position.
 
 
 If MC has shown anything, it's that computationally, it's much easier
 to suggest a good move, than to evaluate the position.
 
 This is still true with your paper, it's just that the move suggestion
 has become even better.

It is, but I do not think that this is necessarily a feature of NNs.
NNs might be good evaluators, but it is much easier to train them as
a move predictor, as it is not easy to get training data sets for an
evaluation function?!

Detlef

P.S.: As we all might be trying to start incorporating NNs into our
engines, we might pool our resources, at least for the start?!
Maybe by exchanging open-source software links for NNs. I personally would
have started trying NNs some time ago if iOS had OpenCL support, as my
aim is to get a strong iPad go program.

 
 
 Stefan
 
 
 
 
 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go


___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Robert Jasiek

On 20.12.2014 11:21, Detlef Schmicker wrote:
 it is not easy to get training data sets for an evaluation function?!

You seem to be asking for abundant data sets, e.g., with triples (Position,
Territory, Influence). Indeed, only dozens are available in the
literature and they need a bit of extra work. Hundreds of available local
joseki positions do not fit your purpose, e.g., because the Stone
Difference also matters there. However, I suggest a different approach:


1) One strong player (strong enough to be accurate +-1 point of 
territory when using his known judgement methods) creates a few 
examples, e.g., by taking the existing examples for territory and adding 
the influence stone difference. It should be only one player so that the 
values are created consistently. (If several players are involved, they 
should discuss and agree on their application of known methods.)


2) Code is implemented and produces sample data sets.

3) The same player judges how far off the sample data are from his own 
judgement.


Thereby, training does not require many thousands of data sets. Instead 
it requires much of a strong player's time to accurately judge dozens of 
data sets. In theory, the player could be replaced by program judgement, 
but I wish happy development of the then necessary additional theory and 
algorithms! ;)


As you see, I suggest human/program collaboration to accelerate program 
playing strength. Maybe 9p programs can be created without strong 
players' help, but then we will not understand much in terms of go 
theory why the programs will excel. For getting much understanding of go 
theory from programs, human/program collaboration will be necessary anyway.


--
robert jasiek
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Detlef Schmicker
Hi,

I am still fighting with the NN slang, but why do you zero-pad the
output (page 3: 4. Architecture & Training)?

From all I read up to now, most are zero-padding the input to make the
output fit 19x19?!

Thanks for the great work

Detlef

On Friday, 19.12.2014, 23:17 +0000, Aja Huang wrote:
 Hi all,
 
 
 We've just submitted our paper to ICLR. We made the draft available at
 http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf
 
 
 
 I hope you enjoy our work. Comments and questions are welcome.
 
 
 Regards,
 Aja
 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go


___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Álvaro Begué
If you start with a 19x19 grid and you take convolutional filters of size
5x5 (as an example), you'll end up with a board of size 15x15, because a
5x5 box can be placed inside a 19x19 board in 15x15 different locations. We
can get 19x19 outputs if we allow the 5x5 box to be centered on any point,
but then you need to multiply by values outside of the original 19x19
board. Zero-padding just means you'll use 0 as the value coming from
outside the board. You can either prepare a 23x23 matrix with two rows of
zeros along the edges, or you can just keep the 19x19 input and do your
math carefully so terms outside the board are ignored.
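
A minimal numpy illustration of the first option (the random plane and filter
are just stand-ins): pad the 19x19 input with two rows and columns of zeros so
that a 5x5 filter produces a 19x19 output.

    import numpy as np

    plane = np.random.rand(19, 19)    # stand-in for one input feature plane
    kernel = np.random.rand(5, 5)     # stand-in for one 5x5 filter

    padded = np.pad(plane, 2)         # 23x23: two rows/columns of zeros per edge
    out = np.empty((19, 19))
    for i in range(19):               # slide the 5x5 window over the padded plane
        for j in range(19):
            out[i, j] = np.sum(padded[i:i+5, j:j+5] * kernel)
    print(out.shape)                  # (19, 19)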



On Sat, Dec 20, 2014 at 12:01 PM, Detlef Schmicker d...@physik.de wrote:

 Hi,

 I am still fighting with the NN slang, but why do you zero-pad the
 output (page 3: 4. Architecture & Training)?

 From all I read up to now, most are zero-padding the input to make the
 output fit 19x19?!

 Thanks for the great work

 Detlef

 On Friday, 19.12.2014, 23:17 +0000, Aja Huang wrote:
  Hi all,
 
 
  We've just submitted our paper to ICLR. We made the draft available at
  http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf
 
 
 
  I hope you enjoy our work. Comments and questions are welcome.
 
 
  Regards,
  Aja
  ___
  Computer-go mailing list
  Computer-go@computer-go.org
  http://computer-go.org/mailman/listinfo/computer-go


 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Mark Wagner
Thanks for sharing. I'm intrigued by your strategy for integrating
with MCTS. It's clear that latency is a challenge for integration. Do
you have any statistics on how many searches new nodes had been
through by the time the predictor comes back with an estimation? Did
you try any prefetching techniques? Because the CNN will guide much of
the search at the frontier of the tree, prefetching should be
tractable.

Did you do any comparisons between your MCTS with and w/o CNN? That's
the direction that many of us will be attempting over the next few
months it seems :)

- Mark

On Sat, Dec 20, 2014 at 10:43 AM, Álvaro Begué alvaro.be...@gmail.com wrote:
 If you start with a 19x19 grid and you take convolutional filters of size
 5x5 (as an example), you'll end up with a board of size 15x15, because a 5x5
 box can be placed inside a 19x19 board in 15x15 different locations. We can
 get 19x19 outputs if we allow the 5x5 box to be centered on any point, but
 then you need to multiply by values outside of the original 19x19 board.
 Zero-padding just means you'll use 0 as the value coming from outside the
 board. You can either prepare a 23x23 matrix with two rows of zeros along
 the edges, or you can just keep the 19x19 input and do your math carefully
 so terms outside the board are ignored.



 On Sat, Dec 20, 2014 at 12:01 PM, Detlef Schmicker d...@physik.de wrote:

 Hi,

 I am still fighting with the NN slang, but why do you zero-pad the
 output (page 3: 4. Architecture & Training)?

 From all I read up to now, most are zero-padding the input to make the
 output fit 19x19?!

 Thanks for the great work

 Detlef

 On Friday, 19.12.2014, 23:17 +0000, Aja Huang wrote:
  Hi all,
 
 
  We've just submitted our paper to ICLR. We made the draft available at
  http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf
 
 
 
  I hope you enjoy our work. Comments and questions are welcome.
 
 
  Regards,
  Aja
  ___
  Computer-go mailing list
  Computer-go@computer-go.org
  http://computer-go.org/mailman/listinfo/computer-go


 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go



 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread David Fotland
This would be very similar to the integration I do in Many Faces of Go.  The 
old engine provides a bias to move selection in the tree, but the old engine is 
single-threaded and only does a few hundred evaluations per second.  I 
typically get between 40 and 200 playouts through a node before Old Many Faces 
adjusts the biases.
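
A hedged sketch of that pattern in Python (the node fields, threshold, and
weight below are invented for illustration; the actual Many Faces code is of
course different):

    from dataclasses import dataclass, field

    PRIOR_THRESHOLD = 100  # hypothetical: consult slow knowledge after this many playouts
    PRIOR_WEIGHT = 50      # hypothetical: knowledge counts as this many virtual playouts

    @dataclass
    class Node:
        position: object = None
        visits: int = 0
        virtual_wins: float = 0.0
        virtual_visits: int = 0
        children: list = field(default_factory=list)
        knowledge_applied: bool = False

    def maybe_apply_knowledge(node, evaluate_slow):
        # Bias a node's children once it has been visited often enough
        if node.visits >= PRIOR_THRESHOLD and not node.knowledge_applied:
            priors = evaluate_slow(node.position)        # the old engine, or a CNN
            for child, p in zip(node.children, priors):  # p: prior for that move
                child.virtual_wins += PRIOR_WEIGHT * p
                child.virtual_visits += PRIOR_WEIGHT
            node.knowledge_applied = True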

David

 -Original Message-
 From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf
 Of Mark Wagner
 Sent: Saturday, December 20, 2014 11:18 AM
 To: computer-go@computer-go.org
 Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional
 Neural Networks
 
 Thanks for sharing. I'm intrigued by your strategy for integrating with
 MCTS. It's clear that latency is a challenge for integration. Do you have
 any statistics on how many searches new nodes had been through by the time
 the predictor comes back with an estimation? Did you try any prefetching
 techniques? Because the CNN will guide much of the search at the frontier
 of the tree, prefetching should be tractable.
 
 Did you do any comparisons between your MCTS with and w/o CNN? That's the
 direction that many of us will be attempting over the next few months it
 seems :)
 
 - Mark
 
  On Sat, Dec 20, 2014 at 10:43 AM, Álvaro Begué alvaro.be...@gmail.com
  wrote:
  If you start with a 19x19 grid and you take convolutional filters of
  size 5x5 (as an example), you'll end up with a board of size 15x15,
  because a 5x5 box can be placed inside a 19x19 board in 15x15 different
  locations. We can get 19x19 outputs if we allow the 5x5 box to be
  centered on any point, but then you need to multiply by values outside
  of the original 19x19 board. Zero-padding just means you'll use 0 as
  the value coming from outside the board. You can either prepare a 23x23
  matrix with two rows of zeros along the edges, or you can just keep the
  19x19 input and do your math carefully so terms outside the board are
  ignored.
 
 
 
  On Sat, Dec 20, 2014 at 12:01 PM, Detlef Schmicker d...@physik.de
 wrote:
 
  Hi,
 
  I am still fighting with the NN slang, but why do you zero-pad the
  output (page 3: 4. Architecture & Training)?
 
  From all I read up to now, most are zero-padding the input to make
  the output fit 19x19?!
 
  Thanks for the great work
 
  Detlef
 
  On Friday, 19.12.2014, 23:17 +0000, Aja Huang wrote:
   Hi all,
  
  
   We've just submitted our paper to ICLR. We made the draft available
   at http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf
  
  
  
   I hope you enjoy our work. Comments and questions are welcome.
  
  
   Regards,
   Aja
   ___
   Computer-go mailing list
   Computer-go@computer-go.org
   http://computer-go.org/mailman/listinfo/computer-go
 
 
  ___
  Computer-go mailing list
  Computer-go@computer-go.org
  http://computer-go.org/mailman/listinfo/computer-go
 
 
 
  ___
  Computer-go mailing list
  Computer-go@computer-go.org
  http://computer-go.org/mailman/listinfo/computer-go
 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Martin Mueller
I think many of the programs have a mechanism for dealing with “slow” 
knowledge. For example in Fuego, you can call a knowledge function for each 
node that reaches some threshold T of playouts. The new technical challenge is 
dealing with the GPU. I know nothing about it myself, but from what I read it 
seems to work best in batch mode - you don’t want to send single positions for 
GPU evaluation back and forth.
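
A sketch of that batching idea in Python (the class, names, and batch size are
invented for illustration): queue leaf positions until a batch is full, then
evaluate the whole batch in one GPU call.

    BATCH_SIZE = 32  # hypothetical; would be tuned to the GPU in practice

    class BatchedEvaluator:
        def __init__(self, net_forward):
            # net_forward: function from a list of positions to a list of priors
            self.net_forward = net_forward
            self.pending = []            # (position, callback) pairs

        def request(self, position, callback):
            self.pending.append((position, callback))
            if len(self.pending) >= BATCH_SIZE:
                self.flush()

        def flush(self):
            if not self.pending:
                return
            positions, callbacks = zip(*self.pending)
            self.pending = []
            results = self.net_forward(list(positions))  # one GPU call per batch
            for cb, priors in zip(callbacks, results):
                cb(priors)  # hand each result back to the requesting tree node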

My impression is that we will see a combination of both in the future - 
“normal”, fast knowledge which can be called as initialization in every node, 
and can be learned by Remi Coulom’s method (e.g. Crazy Stone, Aya, Erica) or by 
Wistuba’s (e.g. Fuego). And then, on top of that a mechanism to improve the 
bias using the slower deep networks running on the GPU.

It would be wonderful if some of us could work on an open source network 
evaluator to integrate with Fuego (or pachi or oakfoam). I know that Clark and 
Storkey are planning to open source theirs, but not in the very near future. I 
do not know about the plans of the Google DeepMind group, but they do mention 
something about a strong Go program in their paper :)

Martin

 Thanks for sharing. I'm intrigued by your strategy for integrating
 with MCTS. It's clear that latency is a challenge for integration. Do
 you have any statistics on how many searches new nodes had been
 through by the time the predictor comes back with an estimation? Did
 you try any prefetching techniques? Because the CNN will guide much of
 the search at the frontier of the tree, prefetching should be
 tractable.
 
 Did you do any comparisons between your MCTS with and w/o CNN? That's
 the direction that many of us will be attempting over the next few
 months it seems :)

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread hughperkins2
Aja wrote:
 We haven't measured that but I think move history is an important feature 
 since Go is very much about answering the opponent's last move locally 
 (that's also why in Go we have the term tenuki for not answering the last 
 move).

I guess you could get some measure of the importance by looking at the weights? 

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-20 Thread Aja Huang
Hi Mark,

2014-12-20 19:17 GMT+00:00 Mark Wagner wagner.mar...@gmail.com:

 Thanks for sharing. I'm intrigued by your strategy for integrating
 with MCTS. It's clear that latency is a challenge for integration. Do
 you have any statistics on how many searches new nodes had been
 through by the time the predictor comes back with an estimation? Did
 you try any prefetching techniques? Because the CNN will guide much of
 the search at the frontier of the tree, prefetching should be
 tractable.

Did you do any comparisons between your MCTS with and w/o CNN? That's
 the direction that many of us will be attempting over the next few
 months it seems :)


I'm glad you like the paper and would consider to attempt. :)
Thanks for the interesting suggestions.

Regards,
Aja



 - Mark

 On Sat, Dec 20, 2014 at 10:43 AM, Álvaro Begué alvaro.be...@gmail.com
 wrote:
  If you start with a 19x19 grid and you take convolutional filters of size
  5x5 (as an example), you'll end up with a board of size 15x15, because a
  5x5 box can be placed inside a 19x19 board in 15x15 different locations.
  We can get 19x19 outputs if we allow the 5x5 box to be centered on any
  point, but then you need to multiply by values outside of the original
  19x19 board. Zero-padding just means you'll use 0 as the value coming from
  outside the board. You can either prepare a 23x23 matrix with two rows of
  zeros along the edges, or you can just keep the 19x19 input and do your
  math carefully so terms outside the board are ignored.
 
 
 
  On Sat, Dec 20, 2014 at 12:01 PM, Detlef Schmicker d...@physik.de
 wrote:
 
  Hi,
 
   I am still fighting with the NN slang, but why do you zero-pad the
   output (page 3: 4. Architecture & Training)?
 
  From all I read up to now, most are zero-padding the input to make the
  output fit 19x19?!
 
  Thanks for the great work
 
  Detlef
 
  On Friday, 19.12.2014, 23:17 +0000, Aja Huang wrote:
   Hi all,
  
  
   We've just submitted our paper to ICLR. We made the draft available at
   http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf
  
  
  
   I hope you enjoy our work. Comments and questions are welcome.
  
  
   Regards,
   Aja
   ___
   Computer-go mailing list
   Computer-go@computer-go.org
   http://computer-go.org/mailman/listinfo/computer-go
 
 
  ___
  Computer-go mailing list
  Computer-go@computer-go.org
  http://computer-go.org/mailman/listinfo/computer-go
 
 
 
  ___
  Computer-go mailing list
  Computer-go@computer-go.org
  http://computer-go.org/mailman/listinfo/computer-go
 ___
 Computer-go mailing list
 Computer-go@computer-go.org
 http://computer-go.org/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-19 Thread Kahn Jonas

Hi Aja


We've just submitted our paper to ICLR. We made the draft available at
http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf

I hope you enjoy our work. Comments and questions are welcome.


I did not look at the go content, on which I'm no expert.
But for the network training, you might be interested in these articles:
«Riemannian metrics for neural networks I and II» by Yann Ollivier:
http://www.yann-ollivier.org/rech/publs/gradnn.pdf
http://www.yann-ollivier.org/rech/publs/pcnn.pdf

He defines invariant metrics on the parameters that are much lighter to
compute than the natural gradient, and usually gets (very, very) much faster
convergence, in a much more robust way, since the result does not depend on
the parametrisation or the activation function.

Jonas
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-19 Thread Erik van der Werf
On Sat, Dec 20, 2014 at 12:17 AM, Aja Huang ajahu...@google.com wrote:
 We've just submitted our paper to ICLR. We made the draft available at
 http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf

Hi Aja,

Wow, very impressive. In fact so impressive, it seems a bit
suspicious(*)... If this is real then one might wonder what it means
about the game. Can it really be that simple? I always thought that
even with perfect knowledge there should usually still be a fairly
broad set of equal-valued moves. Are we perhaps seeing that most of
the time players just reproduce the same patterns over and over again?

Do I understand correctly that your representation encodes the
location of the last 5 moves? If so, do you have an idea how much
extra performance that provides compared to only the last move, or
not using it at all?

Thanks for sharing the paper!

Best,
Erik


* I'd really like to see some test results on pro games that are newer
than any of your training data.
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go