Good point, Roel. Perhaps in the final layers one could make it predict a
model of the expected score distribution (before combining with the komi
and other rule-specific adjustments for handicap stones, pass stones,
last-play parity, etc.). It should be easy enough to back-propagate win/loss
information (and perhaps even more) through such a model.
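A minimal sketch of the idea, assuming the head outputs logits over discrete
score margins from Black's perspective (all names and the binning scheme here
are illustrative, not from any published implementation). Every step is a
softmax followed by a sum, so a win/loss loss can be differentiated straight
through it:

```python
import math

def win_probability(score_logits, komi):
    """Turn hypothetical per-margin logits into P(Black wins) given komi.

    score_logits: dict mapping integer score margin -> logit.
    """
    # Numerically stable softmax over the score bins.
    m = max(score_logits.values())
    exp = {s: math.exp(v - m) for s, v in score_logits.items()}
    z = sum(exp.values())
    probs = {s: e / z for s, e in exp.items()}
    # Black wins exactly when the raw margin exceeds the komi.
    return sum(p for s, p in probs.items() if s > komi)
```

With a uniform distribution over margins {-1, 0, 1, 2} and komi 0.5, half the
mass lies above the komi, so the win probability is 0.5.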


On Thu, Oct 26, 2017 at 3:55 PM, Roel van Engelen <[email protected]>
wrote:

> @Gian-Carlo Pascutto
>
> Since training uses a ridiculous amount of computing power, I wonder if it
> would be useful to make certain changes for future research, like training
> the value head with multiple komi values <https://arxiv.org/pdf/1705.10701.pdf>
>
> On 26 October 2017 at 03:02, Brian Sheppard via Computer-go <
> [email protected]> wrote:
>
>> I think it uses the champion network. That is, the training periodically
>> generates a candidate, and there is a playoff against the current champion.
>> If the candidate wins by more than 55% then a new champion is declared.
>>
>>
>>
>> Keeping a champion is an important mechanism, I believe. It creates the
>> competitive coevolution dynamic, where the network evolves to learn how
>> to beat the best network so far, not just the most recent one. Without
>> that dynamic, the training process can go up and down.
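The gating rule described above can be sketched as follows (function names
are illustrative; only the 55% threshold comes from the discussion):

```python
def gate(candidate_winrate, threshold=0.55):
    """Promote the candidate only if it beats the current champion by
    more than the gating threshold (55% in the discussion above)."""
    return candidate_winrate > threshold

def update_champion(champion, candidate, candidate_winrate):
    # The champion is replaced only when the candidate clears the bar;
    # otherwise training keeps targeting the same best-so-far network.
    return candidate if gate(candidate_winrate) else champion
```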
>>
>>
>>
>> *From:* Computer-go [mailto:[email protected]] *On
>> Behalf Of *uurtamo .
>> *Sent:* Wednesday, October 25, 2017 6:07 PM
>> *To:* computer-go <[email protected]>
>> *Subject:* Re: [Computer-go] Source code (Was: Reducing network size?
>> (Was: AlphaGo Zero))
>>
>>
>>
>> Does the self-play step use the most recent network for each move?
>>
>>
>>
>> On Oct 25, 2017 2:23 PM, "Gian-Carlo Pascutto" <[email protected]> wrote:
>>
>> On 25-10-17 17:57, Xavier Combelle wrote:
>> > Is there some way to distribute learning of a neural network ?
>>
>> Learning as in training the DCNN, not really unless there are high
>> bandwidth links between the machines (AFAIK - unless the state of the
>> art changed?).
>>
>> Learning as in generating self-play games: yes. Especially if you update
>> the network only every 25 000 games.
>>
>> My understanding is that this task is bottlenecked much more on game
>> generation than on DCNN training, until you have quite a few machines
>> generating games.
>>
>> --
>> GCP
>> _______________________________________________
>> Computer-go mailing list
>> [email protected]
>> http://computer-go.org/mailman/listinfo/computer-go
>>
>>
>>
>
>
>
