Working on it :)

David

> -----Original Message-----
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf
> Of "Ingo Althöfer"
> Sent: Tuesday, February 23, 2016 7:56 AM
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Move evalution by expected
> value, as product of expected winrate and expected points?
> 
> My 1.5 cent:
> 
> David Fotland has a nice score-estimator in his (old) ManyFaces bot.
> The score estimator is still from the days before the Monte Carlo
> version.
> 
> Perhaps, David can improve on this estimator with help of CNNs.
> 
> Ingo.
> 
> 
> 
> Sent: Tuesday, February 23, 2016 at 16:41
> From: "Justin .Gilmer" <jmgil...@gmail.com>
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Move evalution by expected value, as product
> of expected winrate and expected points?
> 
> I made a similar attempt to Alvaro's to predict final ownership. You can
> find the code here: https://github.com/jmgilmer/GoCNN/. It's trained to
> predict final ownership on about 15000 professional games which were
> played until the end (didn't end in resignation). It gets about 80.5%
> accuracy on a held-out test set, although the accuracy varies greatly
> with how far through the game you are. Can't say how well it would
> work in a Go player.
> 
> -Justin
> 
> On Tue, Feb 23, 2016 at 7:00 AM, <computer-go-requ...@computer-go.org>
> wrote:
> 
> Send Computer-go mailing list submissions to
> computer-go@computer-go.org
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> http://computer-go.org/mailman/listinfo/computer-go
> or, via email, send a message with subject or body 'help' to
> computer-go-requ...@computer-go.org
> 
> You can reach the person managing the list at
> computer-go-ow...@computer-go.org
> 
> When replying, please edit your Subject line so it is more specific than
> "Re: Contents of Computer-go digest..."
> 
> 
> Today's Topics:
> 
>    1. Re: Congratulations to Zen! (Robert Jasiek)
>    2. Move evalution by expected value, as product of expected
>       winrate and expected points? (Michael Markefka)
>    3. Re: Move evalution by expected value, as product of expected
>       winrate and expected points? (Álvaro Begué)
>    4. Re: Move evalution by expected value, as product of expected
>       winrate and expected points? (Robert Jasiek)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Mon, 22 Feb 2016 19:13:20 +0100
> From: Robert Jasiek <jas...@snafu.de>
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Congratulations to Zen!
> Message-ID: <56cb4fc0.4010...@snafu.de>
> Content-Type: text/plain; charset=UTF-8; format=flowed
> 
> Aja, sorry to bother you with trivialities, but how does Alphago avoid
> power or network failures and such incidents?
> 
> --
> robert jasiek
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Tue, 23 Feb 2016 11:36:57 +0100
> From: Michael Markefka <michael.marke...@gmail.com>
> To: computer-go@computer-go.org
> Subject: [Computer-go] Move evalution by expected value, as product of
> expected winrate and expected points?
> Message-ID:
>         <CAJg7PAPU_gbHvNy3Cv+D-p238_hkqkv5pojxozjly4nsqas...@mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
> 
> Hello everyone,
> 
> in the wake of AlphaGo using a DCNN to predict expected winrate of a
> move, I've been wondering whether one could train a DCNN for expected
> territory or points successfully enough to be of some use (leaving the
> issue of win by resignation for a more in-depth discussion). And,
> whether winrate and expected territory (or points) always run in
> parallel or whether there are diverging moments.
> 
> Computer Go programs play what are considered slack or slow moves when
> ahead, sometimes being too conservative and giving away too much of
> their potential advantage. If expected points and expected winrate
> diverge, this could be a way to make the programs play in a more natural
> way, even if there were no strength increase to be gained.
> Then again there might be a parameter configuration that might yield
> some advantage and perhaps this configuration would need to be dynamic,
> favoring winrate the further the game progresses.
> 
> 
> As a general example for the idea, let's assume we have the following
> potential moves generated by our program:
> 
> #1: Winrate 55%, +5 expected final points
> #2: Winrate 53%, +15 expected final points
> 
> Is the move with higher winrate always better? Or would there be some
> benefit to choosing #2? Would this differ depending on how far along the
> game is?
> 
> If we knew the winrate prediction to be perfect, then going by that
> alone would probably result in the best overall performance. But given
> some uncertainty there, expected value could be interesting.
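
A minimal sketch of the comparison above, using the two example moves. The `Candidate` class, the function names, the `point_scale` of 20 points and the linear weighting by `progress` are all assumptions made for illustration; none of them comes from an existing program.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    winrate: float          # estimated probability of winning, in [0, 1]
    expected_points: float  # estimated final margin in points


def product_score(c: Candidate) -> float:
    """Expected value as the plain product of winrate and expected points."""
    return c.winrate * c.expected_points


def blended_score(c: Candidate, progress: float, point_scale: float = 20.0) -> float:
    """Mix winrate with a squashed point margin.

    progress is 0.0 at the start of the game and 1.0 at the end; the weight
    on winrate grows linearly with it (an assumed schedule, not a tuned one).
    """
    # Map the point margin into [0, 1] so it is comparable to a winrate.
    points_term = max(0.0, min(1.0, 0.5 + c.expected_points / (2.0 * point_scale)))
    return progress * c.winrate + (1.0 - progress) * points_term


def best(moves, key):
    return max(moves, key=key).name


moves = [
    Candidate("#1", winrate=0.55, expected_points=5.0),
    Candidate("#2", winrate=0.53, expected_points=15.0),
]

best_by_winrate = best(moves, lambda c: c.winrate)
best_by_product = best(moves, product_score)
best_early = best(moves, lambda c: blended_score(c, progress=0.2))
best_late = best(moves, lambda c: blended_score(c, progress=1.0))

print(best_by_winrate, best_by_product, best_early, best_late)
```

With these numbers, ranking by winrate alone prefers #1, while the product and the early-game blend prefer #2. The blend only switches back to #1 late in the game (progress above roughly 0.93 with the assumed scale).
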
> 
> 
> Any takers for some experiments?
> 
> 
> -Michael
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Tue, 23 Feb 2016 06:44:04 -0500
> From: Álvaro Begué <alvaro.be...@gmail.com>
> To: computer-go <computer-go@computer-go.org>
> Subject: Re: [Computer-go] Move evalution by expected value, as
> product of expected winrate and expected points?
> Message-ID:
>         <CAF8dVMWLPQBhD-Q07YeLZwqV9M9JCW+_VbSRVp=evj9cn6w...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> I have experimented with a CNN that predicts ownership, but I found it
> to be too weak to be useful. The main difference between what Google did
> and what I did is in the dataset used for training: I had tens of
> thousands of games (I did several different experiments) and I used all
> the positions from each game (which is known to be problematic); they
> used 30M positions from independent games. I expect you can learn a lot
> about ownership and expected number of points from a dataset like that.
> Unfortunately, generating such a dataset is infeasible with the
> resources most of us have.
> 
> Here's an idea: Google could make the dataset publicly available for
> download, ideally with the final configurations of the board as well.
> There is a tradition of making interesting datasets for machine learning
> available, so I have some hope this may happen.
> 
> The one experiment I would like to make along the lines of your post is
> to train a CNN to compute both the expected number of points and its
> standard deviation. If you assume the distribution of scores is well
> approximated by a normal distribution, maximizing winning probability
> can be achieved by maximizing (expected score) / (standard deviation of
> the score). I wonder if that results in stronger or more natural play
> than making a direct model for winning probability, because you get to
> learn more about each position.
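
The claim above follows because, under the normal assumption, P(win) = Phi(mean / std), and Phi is increasing. A small sketch, assuming the score is measured from the player's point of view with komi already included (so a win means score > 0); the candidate names and numbers are invented for illustration.

```python
import math


def win_probability(mean: float, std: float) -> float:
    """P(score > 0) when score ~ Normal(mean, std**2)."""
    if std <= 0:
        raise ValueError("std must be positive")
    # Standard normal CDF evaluated at mean / std, written with math.erf.
    return 0.5 * (1.0 + math.erf(mean / (std * math.sqrt(2.0))))


# Hypothetical (expected score, standard deviation) pairs for two moves.
candidates = {
    "solid": (3.0, 4.0),
    "ambitious": (8.0, 20.0),
}

by_ratio = sorted(
    candidates,
    key=lambda k: candidates[k][0] / candidates[k][1],
    reverse=True,
)
by_win_probability = sorted(
    candidates,
    key=lambda k: win_probability(*candidates[k]),
    reverse=True,
)

print(by_ratio, by_win_probability)
```

Here the "solid" move has the lower expected score but the higher ratio, so it comes first under both orderings.
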
> 
> Álvaro.
> 
> 
> 
> On Tue, Feb 23, 2016 at 5:36 AM, Michael Markefka
> <michael.marke...@gmail.com> wrote:
> 
> > Hello everyone,
> >
> > in the wake of AlphaGo using a DCNN to predict expected winrate of a
> > move, I've been wondering whether one could train a DCNN for expected
> > territory or points successfully enough to be of some use (leaving the
> > issue of win by resignation for a more in-depth discussion). And,
> > whether winrate and expected territory (or points) always run in
> > parallel or whether there are diverging moments.
> >
> > Computer Go programs play what are considered slack or slow moves when
> > ahead, sometimes being too conservative and giving away too much of
> > their potential advantage. If expected points and expected winrate
> > diverge, this could be a way to make the programs play in a more
> > natural way, even if there were no strength increase to be gained.
> > Then again there might be a parameter configuration that might yield
> > some advantage and perhaps this configuration would need to be
> > dynamic, favoring winrate the further the game progresses.
> >
> >
> > As a general example for the idea, let's assume we have the following
> > potential moves generated by our program:
> >
> > #1: Winrate 55%, +5 expected final points
> > #2: Winrate 53%, +15 expected final points
> >
> > Is the move with higher winrate always better? Or would there be some
> > benefit to choosing #2? Would this differ depending on how far along
> > the game is?
> >
> > If we knew the winrate prediction to be perfect, then going by that
> > alone would probably result in the best overall performance. But given
> > some uncertainty there, expected value could be interesting.
> >
> >
> > Any takers for some experiments?
> >
> >
> > -Michael
> > _______________________________________________
> > Computer-go mailing list
> > Computer-go@computer-go.org
> > http://computer-go.org/mailman/listinfo/computer-go
> 
> ------------------------------
> 
> Message: 4
> Date: Tue, 23 Feb 2016 12:54:22 +0100
> From: Robert Jasiek <jas...@snafu.de>
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Move evalution by expected value, as
> product of expected winrate and expected points?
> Message-ID: <56cc486e.1030...@snafu.de>
> Content-Type: text/plain; charset=UTF-8; format=flowed
> 
> On 23.02.2016 11:36, Michael Markefka wrote:
> > whether one could train a DCNN for expected territory
> 
> First, some definition of territory must be chosen or stated. Second,
> you must decide if territory according to this definition can be
> determined by a neural net meaningfully at all. Third, if yes, do it.
> 
> Note that there are very different definitions of territory. The most
> suitable definition for positional judgement (see Positional Judgement 1
> - Territory) is sophisticated and requires a combination of expert rules
> (specifying what to determine, and how to read to determine it) and
> reading.
> 
> A weak definition could predict whether a particular intersection will
> be territory in the game end's scoring position. Such can be fast for MC
> or NN, and maybe such is good enough as a very rough approximation for
> programs. For humans, such is very bad because it neglects different
> degrees of safety of (potential) territory and the strategic concepts of
> sacrifice and exchange.
> 
> I have also suggested other definitions, but IMO they are less
> attractive for NN.
> 
> --
> robert jasiek
> 
> 
> ------------------------------
> 
> Subject: Digest Footer
> 
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
> 
> ------------------------------
> 
> End of Computer-go Digest, Vol 73, Issue 42
> *******************************************
