During its training, AlphaGo played many handicap games against a previous version of itself, so the team and the program are acquainted with handicap play. I don't recall reading any discussion of komi experiments. Google's advantage is that it can dynamically spin up large pools of processors to try all sorts of experiments, including tournaments designed to test various versions, theories, and tweaks. There was some discussion about what one could do by spinning up thousands of processor cores on demand. There are surely large businesses in China that could do the same; they have similar reasons to pursue deep learning and other creative uses of big data and supercomputing.
On Thursday, January 5, 2017, 9:36 PM, David Ongaro <david.ong...@hamburg.de> wrote:

This discussion reminds me of an incident that happened at the EGC in Tuchola in 2004 (maybe someone can find a source for this). I don't remember all the details, but it went about like this: two amateur players were analyzing a game when a professional player happened to come by, so they asked him how he would assess the position. After a quick look he said, "White is leading by two points." The two players wondered, "You can count that quickly?", but the pro answered, "No, I just asked myself whether I would like to have Black in this position. The answer is no, but with two extra points of komi for Black it would feel OK."

So it seems professionals have already acquired some kind of "value network" through their hard training, but they can also modify its assessments by taking komi into account. Maybe that's something we should do as well, i.e. not only train the value network on go positions and results, but also add the komi as an input (the output would still be a simple win/lose result). That way we wouldn't have to train a different network for each komi, even though the problem of getting enough training data for all komi values still remains.

On Jan 5, 2017, at 11:44 AM, David Ongaro <david.ong...@hamburg.de> wrote:

On Jan 5, 2017, at 2:37 AM, Detlef Schmicker <d...@physik.de> wrote:

Hi, what makes you think the opening theory with reverse komi would be the same as with standard komi? I would be afraid to invest an enormous amount of time just to learn that you have to open differently in reverse-komi games :)

That's why I used the comparative adjective "less". It might not be ideal, but it is still much better than changing the fundamental structure of the opening with an extra stone. Furthermore, the effect might not be as big as you think:

1. The stronger player doesn't have to play overplays when the handicap is correct.
Whether the handicap is correct, and whether AlphaGo "knows" that, is another question though… Of course the weaker player might play differently (i.e. more safely), but at least that is something he or she can control.

2. One could even argue the other way around: we might see more sound (theoretically correct) moves from AlphaGo with reverse komi. If it sees itself ahead already during the opening, it might resort to slack but safe moves; since it's still winning, we are left wondering whether a given move was actually good. But if it plays an unusual-looking move which can't be considered an overplay, and it is still winning in the end with reverse komi, there should be a real insight to gain.

Still, a reverse-komi handicap is rather big, but it might be the next best thing we have without retraining the value network from scratch. Furthermore, retraining the value network would probably affect the playing style even more.

Thanks, David O.

On 05.01.2017 at 10:50, Paweł Morawiecki wrote:

2017-01-04 21:07 GMT+01:00 David Ongaro <david.ong...@hamburg.de>:

[...] So my question is: is it possible to have reverse-komi games by feeding the value network with reversed colors?

In the Nature paper (subsection "Features for policy/value network"), the authors state: "the stone colour at each intersection was represented as either player or opponent rather than black or white."

Then I think the AlphaGo algorithm would be fine with a reverse komi. Namely, a human player takes Black and has 7.5 komi. The next step would be that AlphaGo gives 2 handicap stones but keeps 7.5 komi (normally you would have 0.5). Aja, can you confirm this?

Also, 2-stone games are not so interesting, since they would reveal fewer insights for even-game opening theory.

I agree with David here; most insights would come from even games. But we can imagine the following show:
Some games are played with reverse komi, some games are played with 2 stones (yet White keeps 7.5 komi), and eventually the main event with normal even games to debunk our myths about the game. Wouldn't that be super exciting!?

Best regards,
Paweł
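To make the two technical points in this thread concrete: the Nature paper's feature description says stones are encoded as "player to move" vs. "opponent" rather than black vs. white, and David's suggestion is to feed komi in as an additional input so a single value network covers all komi values (including reverse komi). A minimal sketch of that interface, with a toy logistic model standing in for the real network (the random weights and layer shapes are placeholders of my own, not AlphaGo's actual architecture):

```python
import numpy as np

# Toy stand-in for a value network. The point is the *interface*:
# (1) stone planes are color-agnostic (player / opponent), and
# (2) komi, from the player-to-move's perspective, is one more input,
# so the same model answers queries for any komi, reverse included.
rng = np.random.default_rng(0)
BOARD = 19 * 19
W = rng.normal(scale=0.01, size=2 * BOARD + 1)  # two stone planes + komi
b = 0.0

def encode(black_stones, white_stones, black_to_move):
    """Return (player_plane, opponent_plane) -- color-agnostic encoding."""
    if black_to_move:
        return black_stones, white_stones
    return white_stones, black_stones

def win_prob(player_plane, opponent_plane, komi_for_player):
    """P(player to move wins); a win/lose output, as suggested in the thread."""
    x = np.concatenate([player_plane, opponent_plane, [komi_for_player]])
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))  # logistic squash to (0, 1)

black = np.zeros(BOARD)
white = np.zeros(BOARD)
p, o = encode(black, white, black_to_move=True)
print(win_prob(p, o, -7.5))  # Black to move, standard 7.5 komi against him
print(win_prob(p, o, +7.5))  # Black to move, reverse komi in his favor
```

The training-data problem David mentions remains: positions would need to be labeled with results under varying komi for the extra input to carry real signal.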
_______________________________________________ Computer-go mailing list Computer-go@computer-go.org http://computer-go.org/mailman/listinfo/computer-go