Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Let's be pragmatic - humans heavily use the information about the last move too. If they take a while, they don't need to know the last move of the opponent when reviewing a position, but when reading a tactical sequence the previous move in the sequence is an essential piece of information. -- Petr Baudis

So maybe there is a greater justification for using last-move info in the playouts than in the tree?

Stefan

___ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi Hugh,

On Fri, Dec 26, 2014 at 9:49 AM, Hugh Perkins hughperk...@gmail.com wrote:
> Estimated total number of parameters approx = 12 layers * 128 filters * 128 previous featuremaps * 3 * 3 filtersize = 1.8 million. But you say 2.3 million. It's similar, so it seems feature maps are fully connected to lower-level feature maps, but I'm not sure where the extra 500,000 parameters should come from?

You may have forgotten to include the position-dependent biases. This is how I computed the number of parameters:

1st layer + 11 * middle layers + final layer + 12 * middle-layer biases + output bias
= 5*5*36*128 + 3*3*128*128*11 + 3*3*128*2 + 128*19*19*12 + 2*19*19 = 2,294,738

> 2. Symmetry. Aja, you say in section 5.1 that adding symmetry does not modify the accuracy, neither higher nor lower. Since adding symmetry presumably reduces the number of weights, and therefore increases learning speed, why did you decide not to implement symmetry?

We were doing exploratory work that optimized performance, not training time, so we don't know how symmetry affects training time. In terms of performance it seems not to have an effect.

Aja
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi,

as I want to buy a graphics card for CNNs: do I need double-precision performance? I'm giving caffe (http://caffe.berkeleyvision.org/) a try, and as far as I understood, most of it is done in single precision. You can get comparable single-precision performance from NVIDIA (as caffe uses CUDA, I'm looking at NVIDIA) for about $340, but the double-precision performance is 10x lower than that of the $1000 cards.

Thanks a lot,
Detlef

Am Mittwoch, den 24.12.2014, 12:14 +0800 schrieb hughperkins2:
> Whilst its technically true that you can use an nn with one hidden layer to learn the same function as a deeper net, you might need a combinatorally large number of nodes :-) scaling learning algorithms towards ai, by bengio and lecunn, 2007, makes a convincing case along these lines.
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
No, you don't need double precision at all.

Álvaro.

On Thu, Dec 25, 2014 at 5:00 AM, Detlef Schmicker d...@physik.de wrote:
> as I want to by graphic card for CNN: do I need double precision performance?
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
> as I want to by graphic card for CNN: do I need double precision performance?

Personally, I was thinking of experimenting with ints, bytes, and shorts, even less precise than singles :-)
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
You are going to be computing gradients of functions, and most people find it easier to think about these things using a type that roughly corresponds to the notion of real number. You can use a fixed-point representation of reals, which uses ints in the end, but then you have to worry about what scale to use, so that you get enough precision but don't run the risk of overflowing. The only reason I might consider a fixed-point representation is to achieve reproducibility of results.

On Thu, Dec 25, 2014 at 5:44 AM, hughperkins2 hughperki...@gmail.com wrote:
> Personally, i was thinking of experimenting with ints, bytes, and shorts, even less precise than singles :-)
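To make the scale-versus-overflow trade-off concrete, here is a minimal sketch of a hypothetical Q16.16 fixed-point format in Python; the format choice and helper names are illustrative, not taken from any particular library:

```python
# Toy Q16.16 fixed point: a real x is stored as the integer round(x * 2**16).
SCALE_BITS = 16
SCALE = 1 << SCALE_BITS

def to_fixed(x):
    return int(round(x * SCALE))

def from_fixed(q):
    return q / SCALE

def fixed_mul(a, b):
    # The raw product of two Q16.16 values carries a factor of 2**32,
    # so we shift back down. With 32-bit storage this intermediate is
    # exactly where overflow strikes -- the "what scale to use" worry
    # mentioned above.
    return (a * b) >> SCALE_BITS

# Bit-exact integer ops, hence reproducible across machines:
a, b = to_fixed(0.75), to_fixed(-1.5)
print(from_fixed(fixed_mul(a, b)))  # -1.125
```

Since every operation is an exact integer operation, two machines running this code produce identical bits, which is the reproducibility benefit mentioned above; the price is manually managing the scale.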
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
You can do some GPU experiments on Amazon AWS before you buy. 65 cents per hour.

David

http://aws.amazon.com/ec2/instance-types/
G2: This family includes G2 instances intended for graphics and general-purpose GPU compute applications. Features: high-frequency Intel Xeon E5-2670 (Sandy Bridge) processors; high-performance NVIDIA GPU with 1,536 CUDA cores and 4GB of video memory. GPU Instances - Current Generation: g2.2xlarge, $0.650 per hour.

-----Original Message-----
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Detlef Schmicker
Sent: Thursday, December 25, 2014 2:00 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

> as I want to by graphic card for CNN: do I need double precision performance?
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi Aja,

A couple of questions:

1. Connectivity, number of parameters

Just to check: each filter connects to all the feature maps below it, is that right? I tried to check that by ball-park estimating the number of parameters in that case, and comparing to the corresponding paragraph in your section 4. And that seems to support the hypothesis. But my estimate is for some reason under-estimating the number of parameters, by about 20%:

Estimated total number of parameters approx = 12 layers * 128 filters * 128 previous featuremaps * 3 * 3 filtersize = 1.8 million

But you say 2.3 million. It's similar, so it seems feature maps are fully connected to lower-level feature maps, but I'm not sure where the extra 500,000 parameters should come from?

2. Symmetry

Aja, you say in section 5.1 that adding symmetry does not modify the accuracy, neither higher nor lower. Since adding symmetry presumably reduces the number of weights, and therefore increases learning speed, why did you decide not to implement symmetry?

Hugh
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
This is my guess as to what the number of parameters actually is:

First layer: 128 * (5*5*36 + 19*19) (128 filters of size 5x5 on 36 layers of input, position-dependent biases)
11 hidden layers: 11 * 128 * (3*3*128 + 19*19) (128 filters of size 3x3 on 128 layers of input, position-dependent biases)
Final layer: 2 * (3*3*128 + 19*19) (2 filters of size 3x3 on 128 layers of input, position-dependent biases)

Total number of parameters: 2,294,738

Did I get that right? I have the same question about the use of symmetry as Hugh.

Álvaro.

On Thu, Dec 25, 2014 at 8:49 PM, Hugh Perkins hughperk...@gmail.com wrote:
> Estimated total number of parameters approx = 12 layers * 128 filters * 128 previous featuremaps * 3 * 3 filtersize = 1.8 million. But you say 2.3 million.
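As a sanity check, the breakdown above can be evaluated mechanically. A small Python sketch, assuming the architecture as described in this thread (36 input planes, one 5x5 layer, 11 hidden 3x3 layers of 128 maps, 2 output maps, position-dependent biases over the 19x19 board):

```python
# Parameter count for the assumed 12-layer architecture discussed above.
BOARD = 19 * 19  # one bias per board point, per feature map

first  = 128 * (5 * 5 * 36 + BOARD)        # 5x5 filters over 36 input planes
hidden = 11 * 128 * (3 * 3 * 128 + BOARD)  # 11 layers of 3x3 filters over 128 maps
final  = 2 * (3 * 3 * 128 + BOARD)         # 2 output maps

total = first + hidden + final
print(total)  # 2294738, matching the ~2.3 million figure quoted in the thread
```

Dropping the `BOARD` bias terms from each line reproduces Hugh's ~1.8 million weight-only estimate, which is where the missing ~500,000 parameters went.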
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi Aja,

Thanks for the game and report. I saw the sgf; the CNN can play a ko fight. Great.

> our best CNN is about 220 to 310 Elo stronger which is consistent

A deeper network and richer info make +300 Elo? Impressive. Aja, if your CNN+MCTS used Erica's playouts, how strong would it be? I think it would be a contender for strongest program.

I also wonder whether Fuego could release the latest version as 1.2, and use odd numbers 1.3.x for development.

Regards,
Hiroshi Yamashita
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hiroshi Yamashita wrote:
> Aja, if your CNN+MCTS use Erica's playout, how strong will it be? I think it will be contender for strongest program.

The playing strength of an MCTS program is dominated by the correctness of the simulations, especially of LD (life and death). Prior knowledge helps a little. David pointed out after the first Densei-sen (almost three years ago): "All MCTS programs have trouble with the positions near the end. The group in the center has miai for two eyes. Same for the group at the top. The upper left side group has one big eye shape. For all three groups the playouts sometimes kill them. The black stones are pretty solid, so the playouts let them survive. So even at the end, Zen has a 50% win rate, MFGO has 60%, and Pachi has a 70% win rate for black."

Without improving the correctness of the simulations, MCTS programs can't move up to the next stage.

Hideki
--
Hideki Kato mailto:hideki_ka...@ybb.ne.jp
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
I thought that any layers beyond 3 were irrelevant. Probably I'm subsuming your nn into what I learned about nn's and didn't read anything carefully enough. Can you help correct me?

s.

On Dec 23, 2014 6:47 AM, Aja Huang ajahu...@google.com wrote:
On Mon, Dec 22, 2014 at 12:38 PM, David Silver davidstarsil...@gmail.com wrote:
> we'll evaluate against Fuego 1.1 and post the results.

I quickly tested our 12-layer CNN against Fuego 1.1 with 5 secs and 10 secs per move, 2 threads. The hardware is Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz.

5 secs per move: 12-layer CNN scored 55.8% ±5.4
10 secs per move: 12-layer CNN scored 32.9% ±3.8

Fuego 1.1 is clearly much weaker than the latest svn release. And interestingly, the network is actually as strong as Fuego 1.1 with 5 secs per move. Since Clark and Storkey's CNN scored 12% against Fuego 1.1 running on weaker hardware, our best CNN is about 220 to 310 Elo stronger, which is consistent with the results against GnuGo.

Regards,
Aja
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
A 3-layer network (input, hidden, output) is sufficient to be a universal function approximator, so from a theoretical perspective only 3 layers are necessary. But the gap between theoretical and practical is quite large. The CNN architecture builds in translation invariance and sensitivity to local phenomena. That gives it a big advantage (on a per-distinct-weight basis) over the flat architecture.

Additionally, the input layers of these CNN designs are very important. Compared to a stone-by-stone representation, the use of high-level concepts in the input layer allows the network to devote its capacity to advanced concepts rather than synthesizing basic concepts.

From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of uurtamo .
Sent: Tuesday, December 23, 2014 7:34 PM
To: computer-go
Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

> I thought that any layers beyond 3 were irrelevant. Probably I'm subsuming your nn into what I learned about nn's and didn't read anything carefully enough. Can you help correct me?
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Whilst it's technically true that you can use an NN with one hidden layer to learn the same function as a deeper net, you might need a combinatorially large number of nodes :-) "Scaling Learning Algorithms towards AI", by Bengio and LeCun, 2007, makes a convincing case along these lines.
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi Martin,

> Would you be willing to share some of the sgf game records played by your network with the community? I tried to replay the game record in your paper, but got stuck since it does not show any of the moves that got captured.

Sorry about that, we will correct the figure and repost. In the meantime Aja will post the .sgf for that game. Also, thanks for noticing that we tested against a stronger version of Fuego than Clark and Storkey; we'll evaluate against Fuego 1.1 and post the results. Unfortunately, we only have approval to release the material in the paper, so we can't really give any further data :-(

One more thing: Aja said he was tired when he measured his own performance on KGS predictions (working too hard on this paper!). So it would be great to get better statistics on how humans really do at predicting the next move. Does anyone want to measure their own performance, say on 200 randomly chosen positions from the KGS data?

> Do you know how large is the effect from using the extra features that are not in the paper by Clark and Storkey, i.e. the last move info and the extra tactics? As a related question, would you get an OK result if you just zeroed out some inputs in the existing net, or would you need to re-train a new network from fewer inputs?

We trained our networks before we knew about Clark and Storkey's results, so we haven't had a chance to evaluate the differences between the approaches. But it's well known that last-move info makes a big difference to predictive performance, so I'd guess they would already be close to 50% predictive accuracy if they included those features.

> Is there a way to simplify the final network so that it is faster to compute and/or easier to understand? Is there something computed, maybe on an intermediate layer, that would be usable as a new feature in itself?

This is an interesting idea, but so far we have only focused on building a large and deep enough network to represent Go knowledge at all.

Cheers,
Dave
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Last move info is a strange beast, isn't it? I mean, except for ko captures, it doesn't really add information to the position. The correct prediction rate is such an obvious metric, but maybe prediction shouldn't be improved at any price. To a certain degree, last move info is a kind of self-delusion. A predictor that does well without it should be a lot more robust, even if the percentages are poorer.

Stefan
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Last move info is a cheap hint toward an unstable area (unless it is a defense move).

Thomas

On Mon, 22 Dec 2014, Stefan Kaitschick wrote:
> Last move info is a strange beast, isn't it? I mean, except for ko captures, it doesn't really add information to the position.
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
On Mon, Dec 22, 2014 at 03:45:47PM +0100, Stefan Kaitschick wrote:
> To a certain degree, last move info is a kind of self-delusion. A predictor that does well without it should be a lot more robust, even if the percentages are poorer.

Let's be pragmatic - humans heavily use the information about the last move too. If they take a while, they don't need to know the last move of the opponent when reviewing a position, but when reading a tactical sequence the previous move in the sequence is an essential piece of information.

-- Petr Baudis
If you do not work on an important problem, it's unlikely you'll do important work. -- R. Hamming
http://www.cs.virginia.edu/~robins/YouAndYourResearch.html
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Aja and co-authors,

first of all, congratulations on outstanding results! I have some questions:

- Would you be willing to share some of the sgf game records played by your network with the community? I tried to replay the game record in your paper, but got stuck since it does not show any of the moves that got captured.

- Do you know how large the effect is from using the extra features that are not in the paper by Clark and Storkey, i.e. the last-move info and the extra tactics? As a related question, would you get an OK result if you just zeroed out some inputs in the existing net, or would you need to re-train a new network from fewer inputs?

- Is there a way to simplify the final network so that it is faster to compute and/or easier to understand? Is there something computed, maybe on an intermediate layer, that would be usable as a new feature in itself?

Thanks,
Martin
[Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
On Fri Dec 19 23:17:23 UTC 2014, Aja Huang wrote:
> We've just submitted our paper to ICLR. We made the draft available at http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf

Cool... just out of curiosity, I did a back-of-an-envelope estimation of the cost of training your and Clark and Storkey's networks, if renting time on AWS GPU instances, and came up with:

- Clark and Storkey: 125 usd (4 days * 2 instances * 0.65 usd/hour)
- Yours: 2025 usd (cost of Clark and Storkey * 25/7 epochs * 29.4/14.7 action-pairs * 12/8 layers)

Probably a bit high for me personally to just spend one weekend for fun, but not outrageous at all, in fact, if the same technique was being used by an organization.
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Great work. Looks like the age of NNs is here. How does this compare in computation time to a heavy MC move generator?

One very minor quibble, and I feel like a nag for even mentioning it: you write "The most frequently cited reason for the difficulty of Go, compared to games such as Chess, Scrabble or Shogi, is the difficulty of constructing an evaluation function that can differentiate good moves from bad in a given position." If MC has shown anything, it's that computationally, it's much easier to suggest a good move than to evaluate the position. This is still true with your paper; it's just that the move suggestion has become even better.

Stefan
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
On 20.12.2014 09:43, Stefan Kaitschick wrote:
> If MC has shown anything, it's that computationally, it's much easier to suggest a good move, than to evaluate the position.

Such can only mean an improper understanding of positional judgement. Positional judgement depends on reading (or MC simulation of reading), but the reading has a much smaller computational complexity because localisation and quiescence apply.

The major aspects of positional judgement are territory and influence. Evaluating influence is much easier than evaluating territory if one uses a partial influence concept: influence stone difference. Its major difficulty is knowing which stones are alive or not; however, MC simulations applied to outside stones should be able to assess that with reasonable certainty fairly quickly. Hence, the major work of positional judgement is the assessment of territory. See my book Positional Judgement 1 - Territory for that.

By designing (heuristically or using a low-level expert system) MC for its methods, territorial positional judgement by MC should be much faster than ordinary MC because many fewer simulations should do. However, it is not as elegant as ordinary MC because some expert knowledge is necessary or must be approximated heuristically. Needless to say, keep the computational complexity of this expert knowledge low.

-- robert jasiek
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Am Samstag, den 20.12.2014, 09:43 +0100 schrieb Stefan Kaitschick:
> If MC has shown anything, it's that computationally, it's much easier to suggest a good move, than to evaluate the position. This is still true with your paper, it's just that the move suggestion has become even better.

It is, but I do not think that this is necessarily a feature of NNs. NNs might be good evaluators too, but it is much easier to train them as a move predictor, as it is not easy to get training data sets for an evaluation function.

Detlef

P.S.: As we all might be trying to start incorporating NNs into our engines, we might bundle our resources, at least for the first start, e.g. by exchanging open-source software links for NNs. I personally would have started trying NNs some time ago if iOS had OpenCL support, as my aim is to get a strong iPad go program.
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
On 20.12.2014 11:21, Detlef Schmicker wrote:
> it is not easy to get training data sets for an evaluation function?!

You seem to be asking for abundant data sets, e.g., with triples (Position, Territory, Influence). Indeed, only dozens are available in the literature and they need a bit of extra work. Hundreds of available local joseki positions do not fit your purpose, e.g., because the stone difference also matters there. However, I suggest a different approach:

1) One strong player (strong enough to be accurate to within +-1 point of territory when using his known judgement methods) creates a few examples, e.g., by taking the existing examples for territory and adding the influence stone difference. It should be only one player so that the values are created consistently. (If several players are involved, they should discuss and agree on their application of known methods.)

2) Code is implemented and produces sample data sets.

3) The same player judges how far off the sample data are from his own judgement.

Thereby, training does not require many thousands of data sets. Instead it requires much of a strong player's time to accurately judge dozens of data sets. In theory, the player could be replaced by program judgement, but I wish happy development of the then-necessary additional theory and algorithms! ;)

As you see, I suggest human/program collaboration to accelerate program playing strength. Maybe 9p programs can be created without strong players' help, but then we will not understand much, in terms of go theory, about why the programs excel. For getting much understanding of go theory from programs, human/program collaboration will be necessary anyway.

-- robert jasiek
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi,

I am still fighting with the NN slang, but why do you zero-pad the output (page 3, section 4, Architecture & Training)? From all I have read up to now, most people zero-pad the input to make the output fit 19x19.

Thanks for the great work,
Detlef

Am Freitag, den 19.12.2014, 23:17 + schrieb Aja Huang:
> Hi all, We've just submitted our paper to ICLR. We made the draft available at http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf I hope you enjoy our work. Comments and questions are welcome. Regards, Aja
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
If you start with a 19x19 grid and you take convolutional filters of size 5x5 (as an example), you'll end up with a board of size 15x15, because a 5x5 box can be placed inside a 19x19 board in 15x15 different locations. We can get 19x19 outputs if we allow the 5x5 box to be centered on any point, but then you need to multiply by values outside of the original 19x19 board. Zero-padding just means you'll use 0 as the value coming from outside the board. You can either prepare a 23x23 matrix with two rows of zeros along the edges, or you can just keep the 19x19 input and do your math carefully so terms outside the board are ignored.

Álvaro.

On Sat, Dec 20, 2014 at 12:01 PM, Detlef Schmicker d...@physik.de wrote:
> I am still fighting with the NN slang, but why do you zero-padd the output (page 3: 4 Architecture Training)? From all I read up to now, most are zero-padding the input to make the output fit 19x19?!
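The size arithmetic behind this explanation can be written down directly. A small sketch (`conv_out` is a made-up helper name; it is just the standard output-size formula for a square filter with symmetric zero padding):

```python
def conv_out(n, k, pad):
    # Output side length of a k x k filter slid over an n x n input
    # that has been zero-padded by `pad` points on each side.
    return n - k + 2 * pad + 1

print(conv_out(19, 5, 0))  # 15: a 5x5 box fits in 15x15 places ("valid")
print(conv_out(19, 5, 2))  # 19: pad by 2 on each side, output stays 19x19
print(conv_out(19, 3, 1))  # 19: 3x3 hidden layers keep 19x19 with pad 1
print(19 + 2 * 2)          # 23: side of the padded input matrix mentioned above
```

So the 23x23 matrix with two rings of zeros is exactly what makes a 5x5 filter produce a 19x19 output, and one ring of zeros does the same for the 3x3 filters in the hidden layers.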
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Thanks for sharing. I'm intrigued by your strategy for integrating with MCTS. It's clear that latency is a challenge for integration. Do you have any statistics on how many searches new nodes had been through by the time the predictor comes back with an estimation? Did you try any prefetching techniques? Because the CNN will guide much of the search at the frontier of the tree, prefetching should be tractable.

Did you do any comparisons between your MCTS with and without the CNN? That's the direction that many of us will be attempting over the next few months, it seems :)

- Mark

On Sat, Dec 20, 2014 at 10:43 AM, Álvaro Begué alvaro.be...@gmail.com wrote:
> Zero-padding just means you'll use 0 as the value coming from outside the board. You can either prepare a 23x23 matrix with two rows of zeros along the edges, or you can just keep the 19x19 input and do your math carefully so terms outside the board are ignored.
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
This would be very similar to the integration I do in Many Faces of Go. The old engine provides a bias to move selection in the tree, but the old engine is single-threaded and only does a few hundred evaluations per second. I typically get between 40 and 200 playouts through a node before old Many Faces adjusts the biases. David -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Mark Wagner Sent: Saturday, December 20, 2014 11:18 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks [...] On Sat, Dec 20, 2014 at 10:43 AM, Álvaro Begué alvaro.be...@gmail.com wrote: [...] ___ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go
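The scheme David describes, where a slow evaluator sets per-move biases that the tree blends with accumulating playout statistics, might look roughly like the sketch below. This is not Many Faces code; the field names (`wins`, `visits`, `bias`) and the decay schedule are hypothetical:

```python
import math

class Child:
    """Minimal stand-in for a child node in the search tree."""
    def __init__(self, wins, visits, bias):
        self.wins, self.visits, self.bias = wins, visits, bias

def select_child(children, c_uct=1.0, c_bias=2.0):
    """UCT selection with an additive knowledge bias: `bias` is set by a
    slow evaluator once the node has enough playouts, and its influence
    decays as real visit statistics accumulate."""
    total = sum(ch.visits for ch in children) or 1

    def score(ch):
        if ch.visits == 0:
            return float('inf')  # always try unvisited moves first
        exploit = ch.wins / ch.visits
        explore = c_uct * math.sqrt(math.log(total) / ch.visits)
        knowledge = c_bias * ch.bias / (1 + ch.visits)  # decaying prior
        return exploit + explore + knowledge

    return max(children, key=score)

# With equal playout statistics, the knowledge bias breaks the tie.
children = [Child(5, 10, 0.0), Child(5, 10, 0.9)]
print(select_child(children) is children[1])  # True
```

The decaying denominator means the slow knowledge matters most exactly in the 40-200 playout window David mentions, and washes out once the node has thousands of real samples.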
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
I think many of the programs have a mechanism for dealing with “slow” knowledge. For example in Fuego, you can call a knowledge function for each node that reaches some threshold T of playouts. The new technical challenge is dealing with the GPU. I know nothing about it myself, but from what I read it seems to work best in batch mode - you don’t want to send single positions for GPU evaluation back and forth. My impression is that we will see a combination of both in the future - “normal”, fast knowledge which can be called as initialization in every node, and can be learned by Remi Coulom’s method (e.g. Crazy Stone, Aya, Erica) or by Wistuba’s (e.g. Fuego). And then, on top of that, a mechanism to improve the bias using the slower deep networks running on the GPU. It would be wonderful if some of us could work on an open source network evaluator to integrate with Fuego (or pachi or oakfoam). I know that Clark and Storkey are planning to open source theirs, but not in the very near future. I do not know about the plans of the Google DeepMind group, but they do mention something about a strong Go program in their paper :) Martin ___ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go
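The batch-mode point can be illustrated with a toy collector: leaf positions queue up until a full batch is sent to the network in one call, rather than shipping single positions back and forth. This is a sketch, not Fuego or Crazy Stone code; `net_forward` stands in for a real GPU forward pass, and the node/position representations are placeholders:

```python
from collections import deque

class BatchEvaluator:
    """Accumulates (node, position) pairs and evaluates them in batches."""
    def __init__(self, net_forward, batch_size=8):
        self.net_forward = net_forward  # callable: list of positions -> priors
        self.batch_size = batch_size
        self.pending = deque()

    def submit(self, node, position):
        """Queue a leaf for evaluation; flush when a full batch is ready."""
        self.pending.append((node, position))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Run one batched forward pass and write priors back into the tree."""
        if not self.pending:
            return
        nodes, positions = zip(*self.pending)
        self.pending.clear()
        for node, priors in zip(nodes, self.net_forward(list(positions))):
            node['priors'] = priors  # tree nodes are dicts in this sketch

# Dummy forward pass: uniform priors over 361 points.
dummy_net = lambda batch: [[1 / 361] * 361 for _ in batch]
ev = BatchEvaluator(dummy_net, batch_size=4)
nodes = [{} for _ in range(4)]
for n in nodes:
    ev.submit(n, object())
print(all('priors' in n for n in nodes))  # True
```

In a real engine the flush would happen on a separate thread or on a timeout as well, so that a partially filled batch does not stall the search, which is exactly where the latency questions in this thread come in.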
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Aja wrote: We haven't measured that but I think move history is an important feature since Go is very much about answering the opponent's last move locally (that's also why in Go we have the term tenuki for not answering the last move). I guess you could get some measure of the importance by looking at the weights? ___ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go
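One crude version of "looking at the weights" is to sum the absolute first-layer filter weights over each input plane and rank the planes by total weight mass. The sketch below uses random stand-in data with the paper's first-layer shape (128 filters over 36 input planes, 5x5); with a trained network, `W1` would be the learned tensor, and this is only a rough proxy for feature importance:

```python
import numpy as np

# Stand-in for trained first-layer weights: (filters, input planes, 5, 5).
W1 = np.random.randn(128, 36, 5, 5)

# Total absolute weight assigned to each input plane across all filters.
importance = np.abs(W1).sum(axis=(0, 2, 3))  # shape: (36,)

# Planes sorted from most to least weight mass.
ranking = np.argsort(importance)[::-1]
print(ranking.shape)  # (36,)
```

A plane whose weights are near zero is plausibly being ignored by the network; the converse (large weights imply importance) is weaker, so an ablation test, retraining without the move-history planes, would be the more direct measurement.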
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi Mark, 2014-12-20 19:17 GMT+00:00 Mark Wagner wagner.mar...@gmail.com: [...] I'm glad you like the paper and that you're considering an attempt. :) Thanks for the interesting suggestions. Regards, Aja ___ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
Hi Aja, We've just submitted our paper to ICLR. We made the draft available at http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf I hope you enjoy our work. Comments and questions are welcome. I did not look at the Go content, on which I'm no expert. But for the network training, you might be interested in these articles: «Riemannian metrics for neural networks I and II» by Yann Ollivier: http://www.yann-ollivier.org/rech/publs/gradnn.pdf http://www.yann-ollivier.org/rech/publs/pcnn.pdf He defines invariant metrics on the parameters, much lighter to compute than the natural gradient, and usually obtains (very, very) much faster convergence, in a much more robust way, since it does not depend on the parametrisation or the activation function. Jonas ___ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go
Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks
On Sat, Dec 20, 2014 at 12:17 AM, Aja Huang ajahu...@google.com wrote: We've just submitted our paper to ICLR. We made the draft available at http://www.cs.toronto.edu/~cmaddis/pubs/deepgo.pdf Hi Aja, Wow, very impressive. In fact so impressive that it seems a bit suspicious(*)... If this is real, then one might wonder what it means about the game. Can it really be that simple? I always thought that even with perfect knowledge there should usually still be a fairly broad set of equal-valued moves. Are we perhaps seeing that most of the time players just reproduce the same patterns over and over again? Do I understand correctly that your representation encodes the locations of the last 5 moves? If so, do you have an idea how much extra performance that provides compared to using only the last move, or no move history at all? Thanks for sharing the paper! Best, Erik * I'd really like to see some test results on pro games that are newer than any of your training data. ___ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go
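One plausible reading of the move-history feature Erik asks about is k one-hot planes marking the locations of the last k moves, most recent first; the paper's exact encoding may differ. A sketch:

```python
import numpy as np

def last_move_planes(move_history, k=5, size=19):
    """Encode the last k moves as k one-hot size x size planes, most
    recent move in plane 0. `move_history` is a list of (row, col)
    coordinates; earlier planes stay all-zero if fewer than k moves exist."""
    planes = np.zeros((k, size, size), dtype=np.float32)
    for i, (row, col) in enumerate(reversed(move_history[-k:])):
        planes[i, row, col] = 1.0
    return planes

planes = last_move_planes([(3, 3), (15, 15), (2, 16)])
print(planes.shape)      # (5, 19, 19)
print(planes[0, 2, 16])  # 1.0  (the most recent move)
```

Erik's ablation question would then amount to retraining with k=1 and k=0 and comparing prediction accuracy, which the thread suggests has not been measured.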