Re: [Computer-go] AlphaGo Zero

Gian-Carlo Pascutto Thu, 26 Oct 2017 08:56:34 -0700

On 25-10-17 16:00, Petr Baudis wrote:
> That makes sense.  I still hope that with a much more aggressive 
> training schedule we could train a reasonable Go player, perhaps at
> the expense of worse scaling at very high elos...  (At least I feel 
> optimistic after discovering a stupid bug in my code.)


By the way, a trivial observation: the initial network is random, so
there's no point in using it for playing the first batch of games. It
won't do anything useful until it has run a learning pass on a bunch of
"win/loss" scored games and it can at least tell who is the likely
winner in the final position (even if it mostly won't be able to make
territory at first).

This suggests that bootstrapping probably wants 500k starting games with
just random moves.
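
A minimal sketch of that bootstrapping step, under stated assumptions: games are played with uniformly random moves and every position is labelled with the final result, giving the value head its first "who won" training targets. All names here are hypothetical. A tiny take-away game stands in for Go only so the sketch runs without a Go engine; a real setup would plug in a board implementation and a final-position scorer.

```python
import random

def random_game(new_game, legal_moves, play, result, rng):
    """Play one game with uniformly random moves.

    Returns the list of positions visited and the final result
    (+1 if the first player won, -1 otherwise).
    """
    state = new_game()
    positions = []
    while True:
        moves = legal_moves(state)
        if not moves:  # no legal moves left: the game is over
            break
        positions.append(state)
        state = play(state, rng.choice(moves))
    return positions, result(state)

def bootstrap_dataset(num_games, new_game, legal_moves, play, result, seed=0):
    """Build (position, outcome) pairs from random games only."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_games):
        positions, outcome = random_game(new_game, legal_moves, play, result, rng)
        # Every position in a game gets that game's final outcome as its label.
        data.extend((pos, outcome) for pos in positions)
    return data

# Stand-in game (NOT Go): 7 stones, take 1-3 per turn, taking the last stone wins.
# State is (stones_left, player_to_move).
def nim_new_game():
    return (7, 0)

def nim_legal_moves(state):
    return [n for n in (1, 2, 3) if n <= state[0]]

def nim_play(state, n):
    return (state[0] - n, 1 - state[1])

def nim_result(state):
    # In the final state the player to move is the loser.
    return 1 if state[1] == 1 else -1
```

Calling bootstrap_dataset(500000, ...) with Go-specific functions in place of the nim_* stand-ins would produce the kind of starting set described above; the 500k figure is the estimate from the text, not a tested value.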

FWIW, it does not seem easy to get the value part of the network to
converge in the dual-res architecture, even when taking the appropriate
steps (1% weighting on the value error, strong regularizer).
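
For concreteness, one possible reading of those two steps, written as a per-sample loss for a dual-head network. This is a sketch under assumptions: "1% weighting" is taken to mean the value head's squared error is scaled by 0.01 before being added to the policy cross-entropy, and the "strong regularizer" is taken to be an L2 penalty on the weights. The function names and the l2_coeff default are illustrative placeholders, not values from this post.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def combined_loss(policy_logits, target_policy, value, target_value,
                  params, value_weight=0.01, l2_coeff=1e-4):
    """Policy cross-entropy + down-weighted value MSE + L2 penalty.

    value_weight=0.01 is the assumed meaning of "1% weighting on error";
    l2_coeff is a placeholder for the unspecified regularizer strength.
    """
    log_p = log_softmax(policy_logits)
    policy_loss = -sum(t * lp for t, lp in zip(target_policy, log_p))
    value_loss = (target_value - value) ** 2
    l2_penalty = sum(w * w for w in params)
    return policy_loss + value_weight * value_loss + l2_coeff * l2_penalty
```

With value_weight at 0.01 the value error contributes little to the gradient, which limits how quickly the value head can memorize game outcomes, at the cost of slower learning on that head.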

-- 
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
