Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

Hideki Kato Thu, 04 Feb 2016 12:41:12 -0800

I think the error is defined as the difference between the 
output of the value network and the average output of the 
simulations done by the policy network (RL) at each position.
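
A rough sketch of that reading, not taken from the paper: the target at each position is the mean result of the simulations, and the error is the squared difference from the value network output. The rollout results and network outputs below are made-up numbers, and `rollout_mean`, `mse` and `to_winrate` are hypothetical helper names. The last lines only re-check the arithmetic quoted further down in this thread (MSE 0.37, value about 0.4, win rate about 70%).

```python
import math

def rollout_mean(outcomes):
    """Average game result (+1 win, -1 loss) over simulations from one position."""
    return sum(outcomes) / len(outcomes)

def mse(predictions, targets):
    """Mean squared error between value-network outputs and targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

def to_winrate(value):
    """Rescale a value in [-1, 1] to a win rate in [0, 1]."""
    return (value + 1.0) / 2.0

# Made-up example: three positions, four simulations each.
rollouts = [[1, 1, -1, 1], [-1, -1, -1, 1], [1, -1, 1, -1]]
targets = [rollout_mean(r) for r in rollouts]
value_net = [0.4, -0.3, 0.1]  # hypothetical value-network outputs
error = mse(value_net, targets)

# Arithmetic from the thread: an MSE of 0.37 is an RMS error of about 0.61,
# so 1 - sqrt(0.37) is about 0.4, which rescales to a win rate near 70%.
rms = math.sqrt(0.37)
typical_value = 1.0 - rms
print(error, rms, typical_value, to_winrate(typical_value))
```

Note that sqrt(MSE) is the root-mean-square error, which is only an upper bound on the mean absolute error, so "average error about 0.6" is an approximation.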


Hideki

Michael Markefka: 
<CAJg7PAN9G2_htRs0mfKuFi82yef7gNFCsouE4ez4f37_pK=k...@mail.gmail.com>: 
>That sounds like it'd be the MSE as classification error of the eventual 
>result.
>
>I'm currently not able to look at the paper, but couldn't you use a
>softmax output layer with two nodes and take the probability
>distribution as winrate?
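
A minimal sketch of the two-node softmax idea, assuming the two output nodes stand for "win" and "loss". The logits are made-up numbers and `softmax2` and `winrate_to_value` are hypothetical names; nothing here is taken from the paper.

```python
import math

def softmax2(logit_win, logit_loss):
    """Numerically stable softmax over two logits; returns (p_win, p_loss)."""
    m = max(logit_win, logit_loss)
    e_win = math.exp(logit_win - m)
    e_loss = math.exp(logit_loss - m)
    total = e_win + e_loss
    return e_win / total, e_loss / total

def winrate_to_value(p_win):
    """Map a win rate in [0, 1] back to the -1..+1 result scale."""
    return 2.0 * p_win - 1.0

p_win, p_loss = softmax2(0.8, -0.2)  # hypothetical logits
print(p_win, p_loss, winrate_to_value(p_win))
```

With two nodes the softmax reduces to a logistic sigmoid of the logit difference, so a single sigmoid output would carry the same information.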

>
>On Thu, Feb 4, 2016 at 8:34 PM, Álvaro Begué <alvaro.be...@gmail.com> wrote:
>> I am not sure how exactly they define MSE. If you look at the plot in figure
>> 2b, the MSE at the very beginning of the game (where you can't possibly know
>> anything about the result) is 0.50. That suggests it's something other than
>> your [very sensible] interpretation.
>>
>> Álvaro.
>>
>> On Thu, Feb 4, 2016 at 2:24 PM, Detlef Schmicker <d...@physik.de> wrote:
>>>
>>> >> Since all positions of all games in the dataset are used, winrate
>>> >> should be distributed from 0% to 100%, or -1 to 1, not 1. Then, the
>>> >> number 70% could be wrong.  An MSE of 0.37 just means the average
>>> >> error is about 0.6, I think.
>>>
>>> 0.6 in the range of -1 to 1,
>>>
>>> which means -1 (e.g. lost by B) games -> typical value -0.4
>>> and +1 games -> typical value +0.4 of the value network
>>>
>>> if I rescale -1 to +1 to 0 - 100% (e.g. winrate for B) then I get about
>>> 30% for games lost by B and 70% for games won by B?
>>>
>>> Detlef
>>>
>>> On 04.02.2016 at 20:10, Hideki Kato wrote:
>>> > Detlef Schmicker: <56b385ce.4080...@physik.de>: Hi,
>>> >
>>> > I am trying to reproduce numbers from section 3: training the value
>>> > network
>>> >
>>> > On the test set of KGS games the MSE is 0.37. Is it correct that
>>> > the results are represented as +1 and -1?
>>> >
>>> >> Looks correct.
>>> >
>>> > This means that in a typical board position you get a value of
>>> > 1-sqrt(0.37) = 0.4  --> this would correspond to a win rate of 70%
>>> > ?!
>>> >
>>> >> Since all positions of all games in the dataset are used, winrate
>>> >> should be distributed from 0% to 100%, or -1 to 1, not 1. Then, the
>>> >> number 70% could be wrong.  An MSE of 0.37 just means the average
>>> >> error is about 0.6, I think.
>>> >
>>> >> Hideki
>>> >
>>> > Is it really true that a typical KGS 6d+ position is judged with
>>> > such a high win rate (even though it is overfitted, so the test
>>> > set number is too bad!), or do I misinterpret the MSE calculation?!
>>> >
>>> > Any help would be great,
>>> >
>>> > Detlef
>>> >
>>> > On 27.01.2016 at 19:46, Aja Huang wrote:
>>> >>>> Hi all,
>>> >>>>
>>> >>>> We are very excited to announce that our Go program, AlphaGo,
>>> >>>> has beaten a professional player for the first time. AlphaGo
>>> >>>> beat the European champion Fan Hui by 5 games to 0. We hope
>>> >>>> you enjoy our paper, published in Nature today. The paper and
>>> >>>> all the games can be found here:
>>> >>>>
>>> >>>> http://www.deepmind.com/alpha-go.html
>>> >>>>
>>> >>>> AlphaGo will be competing in a match against Lee Sedol in
>>> >>>> Seoul, this March, to see whether we finally have a Go
>>> >>>> program that is stronger than any human!
>>> >>>>
>>> >>>> Aja
>>> >>>>
>>> >>>> PS I am very busy preparing AlphaGo for the match, so
>>> >>>> apologies in advance if I cannot respond to all questions
>>> >>>> about AlphaGo.
>>> >>>>
>>> >>>> _______________________________________________ Computer-go
>>> >>>> mailing list Computer-go@computer-go.org
>>> >>>> http://computer-go.org/mailman/listinfo/computer-go
>>> >>>>
-- 
Hideki Kato <mailto:hideki_ka...@ybb.ne.jp>
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
