> > #1: Winrate 55%, +5 expected final points
> > #2: Winrate 53%, +15 expected final points
> >
> > Is the move with higher winrate always better? Or would there be some
> > benefit to choosing #2? Would this differ depending on how far along
> > the game is?
> >
Open to interpretation whether this method is brute force. I think it is. It
uses huge amounts of CPU power to run simulations and evaluate NNs. Even in
chess it was not just about tree search; it needs an evaluation function to
make sense of the search.
2016-02-24 6:52 GMT+02:00 muupan :
>
Congratulations, people at DeepMind! Your paper is very interesting to read.
I have a question about the paper. On policy network training it says
> On the first pass through the training pipeline, the baseline was set to
> zero; on the second pass we used the value network vθ(s) as a baseline;
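If I understand that baseline correctly, it is the standard REINFORCE variance-reduction term: the update is scaled by (z - b(s)) rather than the raw game outcome z, with b(s) = 0 on the first pass and b(s) = vθ(s) on the second. A toy sketch with a linear-softmax policy (all names and shapes are mine, not from the paper):

```python
import numpy as np

def reinforce_update(theta, features, action, outcome, baseline, lr=0.01):
    """One REINFORCE step with a baseline: d_theta ∝ (z - b(s)) * grad log pi(a|s).

    theta: (n_actions, n_features) weights of a linear-softmax policy (a toy
    stand-in for the policy network). outcome z is +1/-1 for the game result;
    baseline b(s) would be 0 on the first pass and v_theta(s) on the second.
    """
    logits = theta @ features
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # gradient of log pi(a|s) w.r.t. theta for a linear-softmax policy
    grad = -np.outer(probs, features)
    grad[action] += features
    return theta + lr * (outcome - baseline) * grad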
If you accumulate the end scores of playout results, you can make a histogram
by plotting the frequency f(s) of a score s as a function of the score. The
winrate is sum(f(s)) over s > 0, divided by the total sum(f(s)). The average
score is sum(s * f(s)) / sum(f(s)), summed over all s.
When the distribution can be approximated by a
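Those two statistics can be written out directly; a small sketch (function name mine) that computes both from raw playout scores:

```python
from collections import Counter

def playout_stats(scores):
    """Compute winrate and average score from a list of playout end scores.

    Scores are from the current player's perspective; positive means a win.
    """
    f = Counter(scores)              # histogram: f[s] = frequency of score s
    n = sum(f.values())
    winrate = sum(c for s, c in f.items() if s > 0) / n
    average = sum(s * c for s, c in f.items()) / n   # sum(s*f(s)) / sum(f(s))
    return winrate, average

# Example: many narrow wins vs. one big loss
wr, avg = playout_stats([2, 3, 2, -20, 4])
# wr = 0.8, avg = -1.8
```

With scores like [2, 3, 2, -20, 4] this gives winrate 0.8 but average score -1.8, which is exactly the #1-vs-#2 tension raised above.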
On Tue, Feb 23, 2016 at 4:41 PM, Justin Gilmer wrote:
> I made a similar attempt as Alvaro to predict final ownership. You can
> find the code here: https://github.com/jmgilmer/GoCNN/. It's trained to
> predict final ownership for about 15000 professional games which were
>
On 23.02.2016 11:36, Michael Markefka wrote:
> whether one could train a DCNN for expected territory
First, some definition of territory must be chosen or stated. Second,
you must decide if territory according to this definition can be
determined by a neural net meaningfully at all. Third, if
I have experimented with a CNN that predicts ownership, but I found it to
be too weak to be useful. The main difference between what Google did and
what I did is in the dataset used for training: I had tens of thousands of
games (I did several different experiments) and I used all the positions
Hello everyone,
in the wake of AlphaGo using a DCNN to predict expected winrate of a
move, I've been wondering whether one could train a DCNN for expected
territory or points successfully enough to be of some use (leaving the
issue of win by resignation for a more in-depth discussion). And,