Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread Dani
Thanks for the tutorial! I have some questions about training:

a) Do you use Dirichlet noise during training, and if so, is it limited to the
first 30 or so plies (roughly the opening phase in chess)?
The AlphaZero paper is not clear about it.

b) Do you need to shuffle batches if you are doing one epoch? Also, after
generating game positions from each game, do you shuffle those positions?
I found the latter to be very important to avoid overfitting.

c) Do you think there is a problem with using the Adam optimizer instead of
SGD with learning rate drops?
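
For concreteness, questions (a) and (b) can be sketched as below. This is only
an illustration, not the tutorial's code: the function names are hypothetical,
and the defaults alpha=0.3 and epsilon=0.25 are assumptions taken from the
values the AlphaZero paper reports for chess and the AlphaGo Zero paper reports
for the mixing weight. The noise is mixed into the root priors as
(1 - epsilon) * p + epsilon * noise, and positions pooled from many games are
shuffled before being cut into minibatches.

```python
import numpy as np


def add_root_dirichlet_noise(priors, alpha=0.3, epsilon=0.25, rng=None):
    """Mix Dirichlet noise into the root priors.

    alpha and epsilon are assumed defaults, not values from the tutorial.
    """
    rng = np.random.default_rng() if rng is None else rng
    priors = np.asarray(priors, dtype=np.float64)
    # One noise sample per legal move; the sample sums to 1.
    noise = rng.dirichlet(np.full(priors.shape[0], alpha))
    return (1.0 - epsilon) * priors + epsilon * noise


def shuffled_minibatches(positions, batch_size, rng=None):
    """Yield minibatches drawn from a random permutation of the positions.

    Shuffling across games breaks up runs of highly correlated positions
    that come from the same game.
    """
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(len(positions))
    for start in range(0, len(order), batch_size):
        yield [positions[i] for i in order[start:start + batch_size]]
```

Whether the noise is applied for the whole game or only for the first N plies
would be decided by the caller, for example by calling
add_root_dirichlet_noise only while the ply count is below some cutoff.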

Daniel

On Sun, Dec 9, 2018 at 6:23 PM cody2007 via Computer-go <
computer-go@computer-go.org> wrote:

> Thanks for your comments.
>
> >Looks like you made it work on 7x7. 19x19 would probably give better
> results, especially against yourself if you are a complete novice.
> I'd expect that would make me win even more against the algorithm, since it
> would explore a far smaller fraction of the search space, right?
> It's certainly something I'd be interested in testing, though I would
> expect it to take many more months of training. It would be
> interesting to see how much performance falls apart, if at all.
>
> >To avoid cheating against GNU Go, use GNU Go's --play-out-aftermath
> parameter.
> Yep, I evaluate with that parameter. The problem is more that I only play
> 20 turns per player per game, and the network seems to like placing stones
> in territories "owned" by the other player. My scoring system then no
> longer counts that area as owned by the player. Playing more turns
> and/or using a more sophisticated scoring system would probably fix this.
>
> >If I'm not mistaken, a competitive AI would need a lot more training, as
> Leela Zero does: https://github.com/gcp/leela-zero
> Yeah, I agree more training is probably the key here. I'll take a look at
> leela-zero.
>
> ‐‐‐ Original Message ‐‐‐
> On Sunday, December 9, 2018 7:41 PM, Xavier Combelle <
> xavier.combe...@gmail.com> wrote:
>
> Looks like you made it work on 7x7. 19x19 would probably give better
> results, especially against yourself if you are a complete novice.
>
> To avoid cheating against GNU Go, use GNU Go's --play-out-aftermath
> parameter.
>
> If I'm not mistaken, a competitive AI would need a lot more training, as
> Leela Zero does: https://github.com/gcp/leela-zero
>
> On 10/12/2018 at 01:25, cody2007 via Computer-go wrote:
>
> Hi all,
>
> I've posted an implementation of the AlphaZero algorithm and a brief
> tutorial. The code runs on a single GPU. While performance is not that
> great, I suspect it's mostly been limited by hardware (my
> training and evaluation have been on a single Titan X). The network can beat
> GNU Go about 50% of the time, although it "abuses" the scoring a little
> bit, which I talk a little more about in the article:
>
>
> https://medium.com/@cody2007.2/alphazero-implementation-and-tutorial-f4324d65fdfc
>
> -Cody
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] New paper by DeepMind

2018-12-06 Thread Dani
What exactly is the innovation that is patented?
Using short look-ahead searches to tune evaluation functions (in this
case a neural network) is not exactly new.

On Thu, Dec 6, 2018 at 3:28 PM Rémi Coulom wrote:

> Hi,
>
> DeepMind's new AlphaZero paper on chess and shogi has been
> published in Science:
>
>
> https://deepmind.com/blog/alphazero-shedding-new-light-grand-games-chess-shogi-and-go/
>
> pdf:
> https://deepmind.com/documents/260/alphazero_preprint.pdf
>
> I tried to play "spot the difference" with their previous draft and did
> not notice any very important difference. They include shogi games, which
> shogi players might appreciate. It seems they still don't give
> the value of their exploration coefficient, unless I missed something.
>
> Also, the AlphaZero algorithm is patented:
> https://patentscope2.wipo.int/search/en/detail.jsf?docId=WO2018215665
>
> Rémi