On 05.01.2017 17:32, Jim O'Flaherty wrote:
I don't follow.

1) "For each arcane position reached, there would now be ample data for AlphaGo to train on that particular pathway." is false. See below.

2) "two strategies. The first would be to avoid the state in the first place." Does AlphaGo have any strategy ever? If it does, does it have strategies of avoiding certain types of positions?

3) "the second would be to optimize play in that particular state." If you mean optimise play = maximise winning probability.

But optimising this is hard when, under positional superko, optimal play can be ca. 13,500,000 moves long and the game tree to that depth is astronomically large. Even TPU-accelerated sampling is lost there.
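
A rough back-of-envelope illustrates the scale, assuming an average branching factor of about 250 (a common estimate for 19x19 Go; the exact value does not matter here):

  250^13,500,000 = 10^(13,500,000 * log10(250)) ~ 10^32,000,000 nodes.

No sampling scheme visits more than a vanishing fraction of such a tree, and even a single optimal line of play is already millions of moves long.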

Afterwards, there is still only one position from which to train. For NN learning, one position is not nearly enough (AlphaGo's supervised policy network was reportedly trained on some 30 million positions), and it cannot replace analysis by mathematical proofs as long as the NN does not emulate mathematical proving.

--
robert jasiek
