Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-03-13 Thread Stefan Kaitschick
The evaluation is always at least as deep as the leaves of the tree. Still, you're right that the earlier in the game, the bigger the inherent uncertainty. One thing I don't understand: if the network gives a thumbs up or down instead of answering with a probability, what is the use of MSE? Why not

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Detlef Schmicker
> One possibility is that 0=loss, 1=win, and the number they are quoting is > sqrt(average((prediction-outcome)^2)). This makes perfect sense for figure 2; even the playouts seem reasonable. But figure 2 is not consistent with the numbers in section

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Hideki Kato
Detlef Schmicker: <56b385ce.4080...@physik.de>: >Hi, > >I am trying to reproduce the numbers from section 3 (training the value network). > >On the test set of KGS games the MSE is 0.37. Is it correct that the >results are represented as +1 and -1? Looks

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Detlef Schmicker
>> Since all positions of all games in the dataset are used, the winrate >> should distribute from 0% to 100%, or -1 to 1, not just ±1. Then the >> number 70% could be wrong. An MSE of 0.37 just means the average >> error is about 0.6, I think. 0.6 in the
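A quick sanity check of that arithmetic, assuming the quoted 0.37 is a plain mean squared error on a -1/+1 outcome scale:

    import math
    mse = 0.37                 # test-set MSE quoted in section 3
    print(math.sqrt(mse))      # ~0.61: typical error on the [-1, +1] scale

So on a ±1 scale an MSE of 0.37 does correspond to an average error of roughly 0.6.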

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Álvaro Begué
I am not sure exactly how they define MSE. If you look at the plot in figure 2b, the MSE at the very beginning of the game (where you can't possibly know anything about the result) is 0.50. That suggests it's something other than your [very sensible] interpretation. Álvaro. On Thu, Feb 4, 2016

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Álvaro Begué
The positions they used are not from high-quality games. They actually include one last move that is completely random. Álvaro. On Thursday, February 4, 2016, Detlef Schmicker wrote: > Hi, > > I am trying to reproduce the numbers from

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Detlef Schmicker
Hi, I am trying to reproduce the numbers from section 3 (training the value network). On the test set of KGS games the MSE is 0.37. Is it correct that the results are represented as +1 and -1? This means that in a typical board position you get a value of

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Detlef Schmicker
Thanks for the response. I am not referring to the finally used data set: in the chapter in question they state that they used their KGS dataset in a first try (which, in another part of the paper, is referred to as being a 6d+ data set). On 04.02.2016 at 18:11

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Álvaro Begué
I re-read the relevant section and I agree with you. Sorry for adding noise to the conversation. Álvaro. On Thu, Feb 4, 2016 at 12:21 PM, Detlef Schmicker wrote: > Thanks for the response. I am not referring to the finally

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Michael Markefka
That sounds like it would be the MSE as a classification error against the eventual result. I'm currently not able to look at the paper, but couldn't you use a softmax output layer with two nodes and take the probability distribution as the winrate? On Thu, Feb 4, 2016 at 8:34 PM, Álvaro Begué
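A minimal sketch of that suggestion (the two-node output is hypothetical; the paper's value network emits a single scalar, so this only illustrates the softmax idea):

    import numpy as np

    def softmax(z):
        e = np.exp(z - np.max(z))      # shift by the max for numerical stability
        return e / e.sum()

    # hypothetical final-layer activations: [logit_loss, logit_win]
    logits = np.array([0.3, 1.1])
    p_loss, p_win = softmax(logits)
    print(p_win)                       # read off P(win) as the winrate estimate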

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Hideki Kato
I think the error is defined as the difference between the output of the value network and the average output of the simulations done by the policy network (RL) at each position. Hideki Michael Markefka:
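If the error were defined that way, the comparison might be computed roughly as below (a sketch of that interpretation only; value_net, policy_rollout and n_rollouts are hypothetical names, not anything from the paper):

    import numpy as np

    def value_vs_rollout_mse(value_net, policy_rollout, positions, n_rollouts=100):
        # squared difference between the value-net output and the mean result
        # of policy-network rollouts, averaged over the given positions
        errs = []
        for pos in positions:
            v = value_net(pos)
            mean_rollout = np.mean([policy_rollout(pos) for _ in range(n_rollouts)])
            errs.append((v - mean_rollout) ** 2)
        return float(np.mean(errs))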

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

2016-02-04 Thread Álvaro Begué
I just want to see how to get 0.5 for the initial position on the board with some definition. One possibility is that 0=loss, 1=win, and the number they are quoting is sqrt(average((prediction-outcome)^2)). On Thu, Feb 4, 2016 at 3:40 PM, Hideki Kato wrote: > I think
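Under that definition the 0.5 at the start of the game falls out immediately: with outcomes coded 0/1 and a prediction of 0.5 for the empty board, every squared error is 0.25. A tiny check of this assumed convention (not taken from the paper):

    import numpy as np

    outcomes = np.random.randint(0, 2, size=10000)         # 0 = loss, 1 = win
    prediction = 0.5                                        # empty board: nothing is known yet
    rmse = np.sqrt(np.mean((prediction - outcomes) ** 2))
    print(rmse)                                             # exactly 0.5, matching figure 2b at move 0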