AlphaGo Zero's neural network takes a 19x19x17 input representing the current and 7 previous board positions (two planes per position, one for each player's stones), plus one plane for the side to play. What if you gave it only the current board position and the side to play, and handled all illegal ko moves only in the tree?
Obviously the network then cannot distinguish between two identical positions, one where a ko move is illegal and one where it is not. But after running MCTS long enough and expanding the tree, AGZ should understand what is going on, right? Does this just make it need more time to find the best move, or is it somehow fundamentally broken? The only thing I can think of is that ko threats might sometimes linger for a very long time, so maybe this is a big problem, but my understanding of Go is limited. For comparison, the original AlphaGo used a feature plane of ones and zeros to indicate legal and illegal moves.
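To make the idea concrete, here is a minimal sketch of what "handling ko only in the tree" could look like. It assumes the tree node tracks the single point forbidden by the simple ko rule, and that the history-free network returns move priors as a plain dict. The names `mask_priors`, `ko_point`, and `net_priors` are hypothetical and not taken from AlphaGo Zero.

```python
# Hypothetical sketch: none of these names come from AlphaGo Zero itself.

def mask_priors(priors, ko_point=None):
    """Drop the ko-forbidden move from the network priors and renormalize.

    priors:   dict mapping a move ((row, col) or "pass") to the prior
              probability produced by a history-free network.
    ko_point: the point forbidden by the simple ko rule at this tree node,
              or None if no ko restriction applies.
    """
    legal = {move: p for move, p in priors.items() if move != ko_point}
    total = sum(legal.values())
    if total == 0:
        # Degenerate case: all prior mass was on the illegal move.
        return {move: 1.0 / len(legal) for move in legal}
    return {move: p / total for move, p in legal.items()}


# Same stones and same side to play, so a history-free network
# returns identical priors for both tree nodes.
net_priors = {(3, 3): 0.5, (3, 4): 0.3, "pass": 0.2}

node_without_ko = mask_priors(net_priors)
node_with_ko = mask_priors(net_priors, ko_point=(3, 3))
```

In this sketch the two nodes end up with different priors only because the tree supplies `ko_point`. The network's value estimate would still be identical for both nodes, so any difference in evaluation has to come from the search below them, which is exactly the open question.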