Detlef,

I'm not sure I understand your last sentence. Do you mean that
eventually you'll put a subset of the functioning nets on CGOS to
measure how quickly their strength is improving?

s.

On Nov 6, 2017 4:54 PM, "Detlef Schmicker" <[email protected]> wrote:

> I thought it might be fun to have some games from the early stages of
> learning from nearly Zero knowledge.
>
> I did not turn off the (relatively weak) playouts; I mix them at 30%
> into the result from the value network. I started from an initial
> random neural net (a small one, about 4 ms on a GTX 970) and use a
> relatively wide MC search (much, much wider than I do for good playing
> strength, unpruning about 5-6 moves), with 100 playouts and an
> expansion every 3 playouts, thus about 33 network evaluations per move.
>
> Additionally, I add Gaussian random numbers with a standard deviation
> of 0.02 to the policy network output.
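>
> In Python terms, roughly (how the noisy priors are kept a valid
> distribution is glossed over; clip-and-renormalize below is just one
> way to do it):
>
>     import numpy as np
>
>     def perturbed_policy(policy_probs, sigma=0.02, rng=None):
>         # add N(0, 0.02) noise to the policy output to diversify self-play
>         rng = rng or np.random.default_rng()
>         noisy = policy_probs + rng.normal(0.0, sigma, size=policy_probs.shape)
>         noisy = np.clip(noisy, 0.0, None)   # one way: keep priors non-negative
>         return noisy / noisy.sum()          # one way: renormalize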
>
> With this setup I play 1000 games and do a reinforcement learning
> cycle with them. One cycle takes me about 5 hours.
>
> For the first 2 days I did not archive games; then I noticed it might
> be fun to have games from the training history, so now I always
> archive one game per cycle.
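>
> Schematically the whole cycle looks like this (play_game, train_on and
> archive_game stand in for the engine's self-play, training and
> game-archiving code):
>
>     def learning_loop(play_game, train_on, archive_game, n_games=1000):
>         cycle = 0
>         while True:
>             games = [play_game() for _ in range(n_games)]  # ~5 hours per cycle
>             train_on(games)                # one reinforcement-learning step
>             archive_game(games[0], cycle)  # keep one game per cycle
>             cycle += 1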
>
>
> Here are some games ...
>
>
> http://physik.de/games_during_learning/
>
>
> I will probably add some more games as I get them, and I will try to
> measure on CGOS how strong the bot is with exactly this (weak, broad
> search) configuration but with a net pretrained from 4d+ KGS games...
>
>
> Detlef
_______________________________________________
Computer-go mailing list
[email protected]
http://computer-go.org/mailman/listinfo/computer-go
