The evaluation is always at least as deep as the leaves of the tree.
Still, you're right that the earlier in the game, the bigger the inherent
uncertainty.
One thing I don't understand: if the network does a thumbs up or down,
instead of answering with a probability,
what is the use of MSE? Why not
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
> One possibility is that 0=loss, 1=win, and the number they are quoting is
> sqrt(average((prediction-outcome)^2)).
This makes perfect sense for figure 2. Even playouts seem reasonable.
But figure 2 is not consistent with the numbers in section
Detlef Schmicker: <56b385ce.4080...@physik.de>:
>Hi,
>
>I am trying to reproduce the numbers from section 3: training the value network
>
>On the test set of KGS games the MSE is 0.37. Is it correct that the
>results are represented as +1 and -1?
Looks
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
>> Since all positions of all games in the dataset are used, the winrate
>> should be distributed from 0% to 100%, or -1 to 1, not 1. Then the
>> number 70% could be wrong. An MSE of 0.37 just means the average
>> error is about 0.6, I think.
0.6 in the
I am not sure exactly how they define MSE. If you look at the plot in
figure 2b, the MSE at the very beginning of the game (where you can't
possibly know anything about the result) is 0.50. That suggests it's
something other than your [very sensible] interpretation.
Álvaro.
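
To make the point about the opening position concrete, here is a minimal
sketch (not taken from the paper) of what plain MSE would be for a predictor
that knows nothing about the result. The balanced sample of outcomes and the
helper name mse are assumptions for illustration only.

```python
def mse(predictions, outcomes):
    """Mean squared error between predictions and game outcomes."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(outcomes)

# Hypothetical balanced sample: half wins, half losses.
outcomes_pm1 = [+1, -1, +1, -1]   # results encoded as +1 / -1
outcomes_01 = [1, 0, 1, 0]        # results encoded as 1 / 0

# An uninformed predictor outputs the midpoint of the label range.
mse_pm1 = mse([0.0] * 4, outcomes_pm1)
mse_01 = mse([0.5] * 4, outcomes_01)

print(mse_pm1, mse_01)  # 1.0 0.25
```

Under these assumptions neither encoding yields 0.50 as a plain MSE at the
start of the game, which is consistent with the suspicion that the plotted
quantity is defined differently.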
On Thu, Feb 4, 2016
The positions they used are not from high-quality games. They actually
include one last move that is completely random.
Álvaro.
On Thursday, February 4, 2016, Detlef Schmicker wrote:
> Hi,
>
> I try to reproduce numbers from
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Hi,
I am trying to reproduce the numbers from section 3: training the value network
On the test set of KGS games the MSE is 0.37. Is it correct that the
results are represented as +1 and -1?
This means, that in a typical board position you get a value of
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Thanks for the response. I am not referring to the data set they finally used:
in the chapter in question they state that they used their KGS dataset
in a first attempt (which another part of the paper describes as
a 6d+ data set).
Am 04.02.2016 um 18:11
I re-read the relevant section and I agree with you. Sorry for adding noise
to the conversation.
Álvaro.
On Thu, Feb 4, 2016 at 12:21 PM, Detlef Schmicker wrote:
> Thanks for the response, I do not refer to the finally
That sounds like it'd be the MSE as classification error of the eventual result.
I'm currently not able to look at the paper, but couldn't you use a
softmax output layer with two nodes and take the probability
distribution as winrate?
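
A minimal sketch of that suggestion, in plain Python rather than any
particular framework: a two-node output is passed through softmax and the
probability of the "win" node is read as the winrate. The ordering of the
two nodes (loss first, win second) and the function names are assumptions
for illustration, not something stated in the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)                          # subtract the max to avoid overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def winrate(logits):
    """Winrate from two output nodes, assumed ordered as [loss, win]."""
    return softmax(logits)[1]

print(winrate([0.0, 0.0]))  # 0.5: equal scores mean no preference
```

With this kind of output the natural training loss would be cross-entropy
rather than MSE, which is the trade-off the question is pointing at.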
On Thu, Feb 4, 2016 at 8:34 PM, Álvaro Begué
I think the error is defined as the difference between the
output of the value network and the average output of the
simulations done by the policy network (RL) at each position.
Hideki
Michael Markefka:
I just want to see how to get 0.5 for the initial position on the board
with some definition.
One possibility is that 0=loss, 1=win, and the number they are quoting is
sqrt(average((prediction-outcome)^2)).
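
A quick check of this reading, as a sketch under the stated assumption
(0=loss, 1=win, and the quoted number is the root of the mean squared
error). The sample outcomes and the name rmse are made up for illustration.

```python
import math

def rmse(predictions, outcomes):
    """sqrt(average((prediction - outcome)^2))."""
    squared = [(p - o) ** 2 for p, o in zip(predictions, outcomes)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical game results, 1 = win, 0 = loss.
outcomes = [1, 0, 0, 1, 1, 0]

# On the empty board the network cannot know the result, so assume it says 0.5.
opening_predictions = [0.5] * len(outcomes)

opening_error = rmse(opening_predictions, outcomes)
print(opening_error)  # 0.5
```

Since |0.5 - outcome| is 0.5 for either result, this gives exactly 0.5 for
the initial position regardless of how many games were won or lost.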
On Thu, Feb 4, 2016 at 3:40 PM, Hideki Kato wrote:
> I think