Re: [Computer-go] AlphaGo watch parties planned across U.S.

2017-05-22 Thread Hiroshi Yamashita

The NicoNico live stream shows Zen's evaluation (in Japanese).
http://live2.nicovideo.jp/watch/lv298444014

Zen thinks White (AlphaGo)'s winrate is 62% at move 86.

Hiroshi Yamashita


Re: [Computer-go] AlphaGo watch parties planned across U.S.

2017-05-22 Thread Hiroshi Yamashita

I'll sometimes post Aya's evaluation.

https://twitter.com/nhk_igo_15bai

Thanks,
Hiroshi Yamashita

- Original Message - 
From: "Horace Ho" 

To: "computer-go" 
Sent: Tuesday, May 23, 2017 10:47 AM
Subject: Re: [Computer-go] AlphaGo watch parties planned across U.S.



 https://www.youtube.com/watch?v=Z-HL5nppBnM

On Tue, May 23, 2017 at 8:27 AM, Hideki Kato  wrote:


Live will start at 11:00 AM JST (GMT+0900).
https://events.google.com/alphago2017/

Hideki



Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread David Wu
Addendum:

Some additional playing around with the same position can flip the roles of
the playouts and value net - so now the value net is very wrong and the
playouts are mostly right. I think this gives good insight into what the
value net is doing and why as a general matter playouts are still useful.

Here's how:
In the "leela10.sgf" game that Marc posted, play the black moves detailed in
the previous email so as to resolve all of the playouts' misunderstandings
and get the position into the "3-10% white win" phase, but otherwise leave
white's dead group on the board with tons of liberties. Let white have an
absolute feast on the rest of the board while black simply connects his
stones solidly. White gets to cut through Q6 and take pretty much every
point available.

Black is still winning even though he loses almost the entire rest of the
board, as long as the middle white group dies. But with some fiddling
around, you can arrive at a position where the value net is reporting 90%
white win (wrong), while the playouts are rightly reporting only 3-10%
white win.

Intuitively, the value net only fuzzily evaluates white's group as probably
dead, but isn't sure that it's dead, so it counts some value for white's
group "in expectation" for the small chance it lives. And the score is
otherwise not too far off on the rest of the board - the way I played it
out, black wins by only ~5 points if white dies. So the small chance that
the huge white group might actually be alive produces enough "expected
value" for white to overwhelm the 5-point loss margin, such that the value
net is 90% sure that white wins.

What the value net has failed to "understand" here is that white's group
surviving is a binary event. That is, a 20% chance of the group being alive
and white winning by 80 points, together with an 80% chance that it's dead
and white losing by 5 points, is not the same as white being (0.2 * 80) -
(0.8 * 5) = 12 points ahead overall (although the value net probably doesn't
"think" in terms of points exactly, but rather something fuzzier). The
playouts provide the much-needed "understanding" that win/loss is binary
and that the expectation operator should be applied after mapping to
win/loss outcomes, rather than before.
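
To put rough numbers on that, here is a tiny illustrative Python sketch; the
20% / 80-point / 5-point figures are the same made-up round numbers as above:

# Illustrative only: why averaging scores first differs from averaging
# win/loss outcomes first, when the game hinges on one binary L+D event.
p_alive = 0.2          # assumed chance the big white group lives
score_if_alive = 80.0  # white wins by ~80 points in that case
score_if_dead = -5.0   # white loses by ~5 points otherwise

expected_score = p_alive * score_if_alive + (1 - p_alive) * score_if_dead
print(expected_score)    # 12.0 -> "white is comfortably ahead on average"

expected_winrate = p_alive * 1.0 + (1 - p_alive) * 0.0
print(expected_winrate)  # 0.2 -> white actually loses four games out of five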

It seems intuitive to me that a neural net would compute things in too fuzzy
and averaged a way and thereby be vulnerable to this mistake. I wonder
whether, with the right training, it's possible to get a value net to handle
these cases more correctly without otherwise weakening it. As it is, I
suspect this is a systematic flaw in the value net's ability to produce good
winning probabilities in games that hinge on the life-and-death chances of a
single large dragon, where the expected score can be wildly uncorrelated
with the probability of winning.


On Mon, May 22, 2017 at 9:39 PM, David Wu  wrote:

> Leela's playouts are definitely extremely bad compared to competitors like
> Crazystone. The deep-learning version of Crazystone has no value net as far
> as I know, only a policy net, which means it's relying on MC playouts alone
> to produce its evaluations. Nonetheless, its playouts often have noticeable
> and usually correct opinions about early-midgame positions (as confirmed by
> the combination of my own judgment as a dan player and Leela's value net),
> which I find amazing - that it can even approximately get these right.
>
> On to the game:
>
> Analyzing that second game with Leela 0.10.0, I think I can infer fairly
> precisely what the playouts are getting wrong. The upper left is indeed
> being disastrously misplayed by them, but that's not all - I'm finding
> that multiple different things are all being played out wrong. All of the
> following numbers are with white to move, to give white the maximal chance
> to distract black from resolving the tactics in the tree search, forcing
> the playouts to do the work - the numbers are sometimes better if it's
> black's turn.
>
> * At move 186, the playouts as they stand show on the order of 60% for
> white, despite black absolutely having won the game. I partly played out
> the rest of the board in a very slow and solid way just in case it was
> confusing things, but not entirely, so that the tree search would still
> have plenty of endgame moves to be distracted by. The playouts stayed at
> 60% for white.
>
> * Add bA15, putting white down to 2 liberties: the PV shows an exchange of
> wA17 and bA14, keeping white at 2 liberties, and it drops to 50%.
> * Add bA16, putting white in atari: it drops to 40%.
>
> So clearly there's some funny business with black in the playouts
> self-atariing or something, and the chance that black does this lessens as
> white has fewer liberties and therefore is more likely to die first. Going
> further:
>
> * Add bE14, it drops to 30%.
>
> So the potential damezumari is another problem - I guess like 10% of the
> playouts are letting white kill the big black lump at 

Re: [Computer-go] AlphaGo watch parties planned across U.S.

2017-05-22 Thread Horace Ho
  https://www.youtube.com/watch?v=Z-HL5nppBnM

On Tue, May 23, 2017 at 8:27 AM, Hideki Kato  wrote:

> Live will start at 11:00 AM JST (GMT+0900).
> https://events.google.com/alphago2017/
>
> Hideki
>
> Stephen Martindale:  ke+r++xfvhgf4xrrts7w...@mail.gmail.com>:
> >Interesting... but I don't see the links for "watch the games live", only
> >the schedule. I assume that the page will be updated when the games
> >actually start... or are about to.
> >
> >Yours Faithfully,
> >*Stephen Martindale*
> >
> >+49 162 577 2166
> >stephen.c.martind...@gmail.com
> >
> >On 22 May 2017 at 09:07, Hideki Kato  wrote:
> >
> >> Probably here.
> >> http://events.google.com/alphago2017/
> >> "Below, you can find out the game schedule, watch the games live
> >> or replay the highlights."
> >>
> >> Hideki
> >>
> >> Joshua Shriver:  ZizJaDp5N1JthzJA=A-DQ@mail.
> >> gmail.com>:
> >> >Where can it be viewed online?
> >> >
> >> >-Josh
> >> >
> >> >On Sun, May 21, 2017 at 9:09 PM, Ray Tayek  wrote:
> >> >>
> >> >http://www.usgo.org/news/2017/05/alphago-watch-parties-
> >> planned-across-u-s/
> >> >>
> >> >> looks like 10:30 pm monday.
> >> >>
> >> >> thanks
> >> >>
> >> >> --
> >> >> Honesty is a very expensive gift. So, don't expect it from cheap
> people
> >> -
> >> >> Warren Buffett
> >> >> http://tayek.com/
> >> >> ___
> >> >> Computer-go mailing list
> >> >> Computer-go@computer-go.org
> >> >> http://computer-go.org/mailman/listinfo/computer-go
> >> >___
> >> >Computer-go mailing list
> >> >Computer-go@computer-go.org
> >> >http://computer-go.org/mailman/listinfo/computer-go
> >> --
> >> Hideki Kato 
> >> ___
> >> Computer-go mailing list
> >> Computer-go@computer-go.org
> >> http://computer-go.org/mailman/listinfo/computer-go
> >>
> > inline file
> >___
> >Computer-go mailing list
> >Computer-go@computer-go.org
> >http://computer-go.org/mailman/listinfo/computer-go
> --
> Hideki Kato 
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>

Re: [Computer-go] AlphaGo watch parties planned across U.S.

2017-05-22 Thread Hideki Kato
Live will start at 11:00 AM JST (GMT+0900).
https://events.google.com/alphago2017/

Hideki

Stephen Martindale: 
:
>Interesting... but I don't see the links for "watch the games live", only
>the schedule. I assume that the page will be updated when the games
>actually start... or are about to.
>
>Yours Faithfully,
>*Stephen Martindale*
>
>+49 162 577 2166
>stephen.c.martind...@gmail.com
>
>On 22 May 2017 at 09:07, Hideki Kato  wrote:
>
>> Probably here.
>> http://events.google.com/alphago2017/
>> "Below, you can find out the game schedule, watch the games live
>> or replay the highlights."
>>
>> Hideki
>>
>> Joshua Shriver: 

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Marc Landgraf
And talking about tactical mistakes:
Here is another game, where a trick joseki early in the game (top right)
completely fools Leela. Leela plays it here the way it would be played in
similar shapes, but then gets completely blindsided. To make things worse,
it finds the one way to maximize the loss. (Note: this is not a reliable
outcome when trying this trick joseki; Leela will often lose the 4/5 stones
on the left, but will at least take the one stone on top in sente instead of
screwing up like it did here.) Generally this "trick" is not that deep
reading-wise, but given its similarity to more common shapes I can
understand how the bot falls for it.
Anyway, Leela manages to fully stabilize the game (given our general
difference in strength, this should come as no surprise), only to throw
away the center group.

But what you should really look at here is Leela's evaluation of the game.

Even very late in the game, the MC part of Leela considers Leela well
ahead, completely misreading the life and death here. In most games Leela
loses to me, the issue usually goes the other way around: Leela's NN
strongly believes the game is won, while the MC part notices the real
trouble. But not here. Of course this kind of misjudgement could also serve
as an explanation of how this group could die in the first place.

But having had my own MC bot, I really wonder how it could misevaluate so
badly here. To really lose this game as Black either requires substantial
self-ataris by Black, or a large unanswered self-atari by White. Does Leela
have such light playouts that those groups can really flip status in 60%+
of the MC evaluations?

2017-05-22 20:46 GMT+02:00 Marc Landgraf :

> Leela has surprisingly large tactical holes. Right now it is throwing a
> good number of games against me in completely won endgames by fumbling away
> entirely alive groups.
>
> As an example I attached one game of myself (3d), even vs Leela10 @7d. But
> this really isn't a onetime occurence.
>
> If you look around move 150, the game is completely over by human
> standards as well as Leelas evaluation (Leela will give itself >80% here)
> But then Leela starts doing weird things.
> 186 is a minor mistake, but itself does not yet throw the game. But it is
> the start of series of bad turns.
> 236 then is a non-threat in a Ko fight, and checking Leelas evaluation,
> Leela doesn't even consider the possibility of it being ignored. This is
> btw a common topic with Leela in ko fights - it does not look at all at
> what happens if the Ko threat is ignored.
> 238 follows up the "ko threat", but this move isn't doing anything either!
> So Leela passed twice now.
> Suddenly there is some Ko appearing at the top right.
> Leela plays this Ko fight in some suboptimal way, not fully utilizing
> local ko threats, but this is a concept rather difficult to grasp for AIs
> afaik.
> I can not 100% judge whether ignoring the black threat of 253 is correct
> for Leela, I have some doubts on this one too.
> With 253 ignored, the game is now heavily swinging, but to my judgement,
> playing the hane instead of 256 would still keep it rather close and I'm
> not 100% sure who would win it now. But Leela decides to completely bury
> itself here with 256, while giving itself still 70% to win.
> As slowly realization of the real game state kicks in, the rest of the
> game is then the usual MC-throw away style we have known for years.
>
> Still... in this game you can see how a series of massive tactical
> blunders leads to throwing a completely won game. And this is just one of
> many examples. And it can not be all pinned on Ko's. I have seen a fair
> number of games where Leela does similar mistakes without Ko involved, even
> though Ko's drastically increase Leelas fumble chance.
> At the same time, Leela is completely and utterly outplaying me on a
> strategical level and whenever it manages to not make screwups like the
> ones shown I stand no chance at all. Even 3 stones is a serious challenge
> for me then. But those mistakes are common enough to keep me around even.
>
> 2017-05-22 17:47 GMT+02:00 Erik van der Werf :
>
>> On Mon, May 22, 2017 at 3:56 PM, Gian-Carlo Pascutto 
>> wrote:
>>
>>> On 22-05-17 11:27, Erik van der Werf wrote:
>>> > On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto >> > > wrote:
>>> >
>>> > ... This heavy pruning
>>> > by the policy network OTOH seems to be an issue for me. My program
>>> has
>>> > big tactical holes.
>>> >
>>> >
>>> > Do you do any hard pruning? My engines (Steenvreter,Magog) always had a
>>> > move predictor (a.k.a. policy net), but I never saw the need to do hard
>>> > pruning. Steenvreter uses the predictions to set priors, and it is very
>>> > selective, but with infinite simulations eventually all potentially
>>> > relevant moves will get sampled.
>>>
>>> With infinite simulations everything is easy :-)
>>>
>>> In 

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Marc Landgraf
Leela has surprisingly large tactical holes. Right now it is throwing away a
good number of games against me in completely won endgames by fumbling away
entirely alive groups.

As an example I attached one game of mine (3d), played even against Leela10
@7d. But this really isn't a one-time occurrence.

If you look around move 150, the game is completely over by human standards
as well as by Leela's evaluation (Leela gives itself >80% here).
But then Leela starts doing weird things.
186 is a minor mistake, and by itself does not yet throw the game, but it is
the start of a series of bad turns.
236 is then a non-threat in a ko fight, and checking Leela's evaluation,
Leela doesn't even consider the possibility of it being ignored. This is,
by the way, a common theme with Leela in ko fights - it does not look at all
at what happens if the ko threat is ignored.
238 follows up the "ko threat", but this move isn't doing anything either!
So Leela has now passed twice.
Suddenly a ko appears at the top right.
Leela plays this ko fight in a somewhat suboptimal way, not fully utilizing
local ko threats, but that is a concept rather difficult for AIs to grasp,
afaik.
I cannot fully judge whether ignoring the black threat of 253 is correct
for Leela; I have some doubts about this one too.
With 253 ignored, the game is now swinging heavily, but in my judgement,
playing the hane instead of 256 would still keep it rather close, and I'm
not 100% sure who would win it now. But Leela decides to completely bury
itself here with 256, while still giving itself 70% to win.
As realization of the real game state slowly kicks in, the rest of the game
is the usual MC throw-away style we have known for years.

Still... in this game you can see how a series of massive tactical blunders
leads to throwing away a completely won game. And this is just one of many
examples. And it cannot all be pinned on kos: I have seen a fair number of
games where Leela makes similar mistakes with no ko involved, even though
kos drastically increase Leela's fumble chance.
At the same time, Leela is completely and utterly outplaying me on a
strategic level, and whenever it manages not to make screwups like the ones
shown, I stand no chance at all. Even 3 stones is a serious challenge for me
then. But those mistakes are common enough to keep me around even.

2017-05-22 17:47 GMT+02:00 Erik van der Werf :

> On Mon, May 22, 2017 at 3:56 PM, Gian-Carlo Pascutto 
> wrote:
>
>> On 22-05-17 11:27, Erik van der Werf wrote:
>> > On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto > > > wrote:
>> >
>> > ... This heavy pruning
>> > by the policy network OTOH seems to be an issue for me. My program
>> has
>> > big tactical holes.
>> >
>> >
>> > Do you do any hard pruning? My engines (Steenvreter,Magog) always had a
>> > move predictor (a.k.a. policy net), but I never saw the need to do hard
>> > pruning. Steenvreter uses the predictions to set priors, and it is very
>> > selective, but with infinite simulations eventually all potentially
>> > relevant moves will get sampled.
>>
>> With infinite simulations everything is easy :-)
>>
>> In practice moves with, say, a prior below 0.1% aren't going to get
>> searched, and I still regularly see positions where they're the winning
>> move, especially with tactics on the board.
>>
>> Enforcing the search to be wider without losing playing strength appears
>> to be hard.
>>
>>
> Well, I think that's fundamental; you can't be wide and deep at the same
> time, but at least you can chose an algorithm that (eventually) explores
> all directions.
>
> BTW I'm a bit surprised that you are still able to find 'big tactical
> holes' with Leela now playing as 8d KGS
>
> Best,
> Erik
>
>
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>


leela10screwup.sgf
Description: application/go-sgf

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Gian-Carlo Pascutto
On 22-05-17 17:47, Erik van der Werf wrote:
> On Mon, May 22, 2017 at 3:56 PM, Gian-Carlo Pascutto  > wrote:
> 
> Well, I think that's fundamental; you can't be wide and deep at the same
> time, but at least you can chose an algorithm that (eventually) explores
> all directions.

Right. But I'm uncomfortable with the current setup, because many
options won't get explored at all in practical situations. It would seem
logical that some minimum amount of (more spread) search effort would
plug enough holes to stop bad blunders, but finding a way to do that and
preserve strength seems elusive so far.

> BTW I'm a bit surprised that you are still able to find 'big tactical
> holes' with Leela now playing as 8d KGS

I've attached an example. It's not the prettiest one (one side has a 0.5
pt advantage in the critical variation so exact komi is an issue), but
it's a recent one from my mailbox.

This is with Leela 0.10.0 so you can follow along:

Leela: loadsgf tactics2.sgf 261
=


Passes: 0                 Black (X) Prisoners: 7
White (O) to move         White (O) Prisoners: 19

   a b c d e f g h j k l m n o p q r s t
19 . X O O . . . . . . . O O X X X O X . 19
18 . X X O O . . . . O O O X X X O O . O 18
17 . . X X O . . . O O X O O X . X O O . 17
16 . . X O O . . O O X X X X . . X X X . 16
15 . . X O . . O X X X X . . . . . X . X 15
14 . . . X O . O X O O O X . X X O O X(X)14
13 . . . X O O O O O X X X X O O X X X O 13
12 . . . . X X . O X X O O X O . O O X O 12
11 . . . . . . . O O X O X X O . . O O O 11
10 . . . X . X X O O O O O X O O O O . . 10
 9 . . . . X . X X O . O X O X X X O . .  9
 8 . . . . X . X O . . O X O O X O . . .  8
 7 X X X X O X X O . O O X O X X O O . .  7
 6 O O O O O O O . . . O X O . X X O . .  6
 5 X O O . X O O O O O X X X X X O . . .  5
 4 X X O O O X X O . O X X . X O O . . .  4
 3 X . X X X X O . O . O X . . X O . . .  3
 2 X O X . . O O . O . O O X X X O O . .  2
 1 . X . O O . . O O . O X X . X X O . .  1
   a b c d e f g h j k l m n o p q r s t

Hash: 106A3898CEC94132 Ko-Hash: 67E390C41BF2577

Black time: 00:30:00
White time: 00:30:00

Leela: heatmap
=

94.46% G11
 4.20% C1
 1.31% E2
 0.03% all other moves together

Note that O16 P17 N15 wins immediately. It's not that Leela is
completely blind to it, because that sequence is in some variations. But
here, O16 won't get searched for a long time (it is actually the 4th-rated
move) due to the skewed probabilities.

Leela: play w g11
=

Leela: heatmap
=

99.9% F11
 0.1% C1
   0% all other moves together

Leela: genmove b


Score looks bad. Resigning.
= resign

https://timkr.home.xs4all.nl/chess2/resigntxt.htm

If black plays O16 instead here he wins by 0.5 points.
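
For readers unfamiliar with why such skewed priors starve a move of search,
here is a minimal PUCT-style selection sketch in Python (illustrative only,
not Leela's actual code; the priors mirror the heatmap above, O16 is assumed
to take essentially all of the remaining 0.03%, and the per-playout values
are invented):

import math

def puct_score(prior, visits, total_value, parent_visits, c_puct=1.0):
    # Exploitation term (mean value) plus prior-weighted exploration term.
    q = total_value / visits if visits > 0 else 0.0
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + visits)
    return q + u

children = {
    "G11": {"prior": 0.9446, "visits": 0, "value": 0.0},
    "C1":  {"prior": 0.0420, "visits": 0, "value": 0.0},
    "E2":  {"prior": 0.0131, "visits": 0, "value": 0.0},
    "O16": {"prior": 0.0003, "visits": 0, "value": 0.0},  # the winning move
}

for n in range(1, 401):
    best = max(children, key=lambda m: puct_score(
        children[m]["prior"], children[m]["visits"],
        children[m]["value"], n))
    children[best]["visits"] += 1
    children[best]["value"] += 0.45   # pretend every playout returns 0.45

print({m: c["visits"] for m, c in children.items()})
# With these made-up numbers nearly all visits go to G11, a few to C1, and
# E2 and O16 are never tried at all - the refutation simply isn't examined.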

-- 
GCP


tactics2.sgf
Description: application/go-sgf

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Erik van der Werf
On Mon, May 22, 2017 at 3:56 PM, Gian-Carlo Pascutto  wrote:

> On 22-05-17 11:27, Erik van der Werf wrote:
> > On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto  > > wrote:
> >
> > ... This heavy pruning
> > by the policy network OTOH seems to be an issue for me. My program
> has
> > big tactical holes.
> >
> >
> > Do you do any hard pruning? My engines (Steenvreter,Magog) always had a
> > move predictor (a.k.a. policy net), but I never saw the need to do hard
> > pruning. Steenvreter uses the predictions to set priors, and it is very
> > selective, but with infinite simulations eventually all potentially
> > relevant moves will get sampled.
>
> With infinite simulations everything is easy :-)
>
> In practice moves with, say, a prior below 0.1% aren't going to get
> searched, and I still regularly see positions where they're the winning
> move, especially with tactics on the board.
>
> Enforcing the search to be wider without losing playing strength appears
> to be hard.
>
>
Well, I think that's fundamental; you can't be wide and deep at the same
time, but at least you can choose an algorithm that (eventually) explores
all directions.

BTW I'm a bit surprised that you are still able to find 'big tactical
holes' with Leela now playing as 8d KGS.

Best,
Erik

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Gian-Carlo Pascutto
On 22-05-17 15:46, Erik van der Werf wrote:
> Oh, haha, after reading Brian's post I guess I misunderstood :-)
> 
> Anyway, LMR seems like a good idea, but last time I tried it (in Migos)
> it did not help. In Magog I had some good results with fractional depth
> reductions (like in Realization Probability Search), but it's a long
> time ago and the engines were much weaker then...

What was generating your probabilities, though? A strong policy DCNN or
something weaker?

ERPS (LMR with fractional reductions based on move probabilities) with
alpha-beta seems very similar to having MCTS with the policy prior being
a factor in the UCT formula. This is what AlphaGo did according to their
2015 paper, so it can't be terrible, but it does mean that you are 100%
blind to something the policy network doesn't see, which seems
worrisome. I think I asked Aja once about what they do with first play
urgency given that the paper doesn't address it - he politely ignored
the question :-)

The obvious defense (when looking at it in alpha-beta formulation) would
be to cap the depth reduction, and (in MCTS/UCT formulation) to cap the
minimum probability. I had no success with this in Go so far.
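
To make the fractional-reduction idea concrete, here is a small illustrative
Python sketch (my own toy numbers, not any engine's actual code) of an
RPS-style depth reduction with the cap mentioned above:

import math

def depth_reduction(prior, scale=0.5, max_reduction=3.0):
    # Reduce the remaining depth by an amount that grows as the move's
    # policy prior shrinks; the cap keeps even very unlikely moves searchable.
    return min(max_reduction, scale * -math.log(max(prior, 1e-9)))

for p in (0.5, 0.1, 0.01, 0.001):
    print(p, round(depth_reduction(p), 2))
# 0.5 -> 0.35, 0.1 -> 1.15, 0.01 -> 2.3, 0.001 -> 3.0 (capped)

The MCTS/UCT-side analogue of the same cap would be flooring the policy
prior at some minimum before it enters the selection formula.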

-- 
GCP

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Gian-Carlo Pascutto
On 22-05-17 14:48, Brian Sheppard via Computer-go wrote:
> My reaction was "well, if you are using alpha-beta, then at least use
> LMR rather than hard pruning." Your reaction is "don't use 
> alpha-beta", and you would know better than anyone!

There are two aspects to my answer:

1) Unless you've made a breakthrough with value nets, there appears to
be a benefit to keeping the Monte Carlo simulations.

2) I am not sure the practical implementations of both algorithms end up
searching in a different manner.

(1) is an argument against using alpha-beta. If we want to get rid of
the MC simulations - for whatever reason - it disappears. (2) isn't an
argument against. Stating the algorithm in a different manner may make
some heuristics or optimizations more obvious.

> Yes, LMR in Go has is a big difference compared to LMR in chess: Go 
> tactics take many moves to play out, whereas chess tactics are often
>  pretty immediate.

Not sure I agree with the basic premise here.

> So LMR could hurt Go tactics much more than it hurts chess tactics.
> Compare the benefit of forcing the playout to the end of the game.

LMR doesn't prune anything; it just reduces the remaining search depth
for non-highly rated moves. So it's certainly not going to make
something tactically weaker than hard pruning? If you're talking about
not pruning or reducing at all, you get the issue of the branching
factor again.
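
For concreteness, here is a toy negamax sketch of that reduce-then-re-search
control flow (the "game" is fake - positions are just integers and the
evaluation is random noise - purely to show that late moves get reduced
depth rather than being dropped):

import random

def evaluate(pos):
    return random.Random(pos).uniform(-1.0, 1.0)  # fake leaf evaluation

def moves(pos):
    return [(pos * 31 + k) & 0xFFFFFFFF for k in range(8)]  # fake children

def search(pos, depth, alpha, beta):
    if depth == 0:
        return evaluate(pos)
    for i, child in enumerate(moves(pos)):        # assume best-rated first
        reduction = 1 if i >= 3 and depth >= 3 else 0  # reduce "late" moves
        score = -search(child, depth - 1 - reduction, -beta, -alpha)
        if reduction and score > alpha:
            # The reduced search looked promising: re-search at full depth,
            # which is why LMR never hard-prunes a refutation.
            score = -search(child, depth - 1, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

print(search(1, 4, -1.0, 1.0))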

In chess you have quiescence search to filter out the simpler tactics. I
guess Monte Carlo simulations may act similarly, in that they're going to
raise/lower the score if tactical shenanigans happen in some simulations.

-- 
GCP

Re: [Computer-go] AlphaGo watch parties planned across U.S.

2017-05-22 Thread Hideki Kato
Three hours earlier, DeepMind tweeted: 
>The Future of Go Summit begins tomorrow. Catch all #AlphaGo17 
>matches live from 23-27 May at https://goo.gl/pVjexr

Check https://twitter.com/search?q=%23AlphaGo17 !

Hideki

Hideki Kato: <59228e17.7176%hideki_ka...@ybb.ne.jp>:
>Probably here.  
>http://events.google.com/alphago2017/
>"Below, you can find out the game schedule, watch the games live 
>or replay the highlights."
>
>Hideki
>
>Joshua Shriver: 

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Gian-Carlo Pascutto
On 22-05-17 11:27, Erik van der Werf wrote:
> On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto  > wrote:
> 
> ... This heavy pruning
> by the policy network OTOH seems to be an issue for me. My program has
> big tactical holes.
> 
> 
> Do you do any hard pruning? My engines (Steenvreter,Magog) always had a
> move predictor (a.k.a. policy net), but I never saw the need to do hard
> pruning. Steenvreter uses the predictions to set priors, and it is very
> selective, but with infinite simulations eventually all potentially
> relevant moves will get sampled.

With infinite simulations everything is easy :-)

In practice moves with, say, a prior below 0.1% aren't going to get
searched, and I still regularly see positions where they're the winning
move, especially with tactics on the board.

Enforcing the search to be wider without losing playing strength appears
to be hard.

-- 
GCP

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Erik van der Werf
On Mon, May 22, 2017 at 11:27 AM, Erik van der Werf <
erikvanderw...@gmail.com> wrote:

> On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto 
> wrote:
>>
>> ... This heavy pruning
>> by the policy network OTOH seems to be an issue for me. My program has
>> big tactical holes.
>
>
> Do you do any hard pruning? My engines (Steenvreter,Magog) always had a
> move predictor (a.k.a. policy net), but I never saw the need to do hard
> pruning. Steenvreter uses the predictions to set priors, and it is very
> selective, but with infinite simulations eventually all potentially
> relevant moves will get sampled.
>
>
Oh, haha, after reading Brian's post I guess I misunderstood :-)

Anyway, LMR seems like a good idea, but the last time I tried it (in Migos)
it did not help. In Magog I had some good results with fractional depth
reductions (as in Realization Probability Search), but that was a long time
ago and the engines were much weaker then...

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Brian Sheppard via Computer-go
My reaction was "well, if you are using alpha-beta, then at least use LMR 
rather than hard pruning." Your reaction is "don't use alpha-beta", and you 
would know better than anyone!

Yes, there is a big difference between LMR in Go and LMR in chess: Go tactics 
take many moves to play out, whereas chess tactics are often pretty immediate. 
So LMR could hurt Go tactics much more than it hurts chess tactics. Compare the 
benefit of forcing the playout to the end of the game.

Best,
Brian

-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of 
Gian-Carlo Pascutto
Sent: Monday, May 22, 2017 4:08 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] mini-max with Policy and Value network

On 20/05/2017 22:26, Brian Sheppard via Computer-go wrote:
> Could use late-move reductions to eliminate the hard pruning. Given 
> the accuracy rate of the policy network, I would guess that even move
> 2 should be reduced.
> 

The question I always ask is: what's the real difference between MCTS with a 
small UCT constant and an alpha-beta search with heavy Late Move Reductions? 
Are the explored trees really so different?

In any case, in my experiments Monte Carlo still gives a strong benefit, even 
with a not so strong Monte Carlo part. IIRC it was the case for AlphaGo too, 
and they used more training data for the value network than is publicly 
available, and Zen reported the same: Monte Carlo is important.

The main problem is the "only top x moves part". Late Move Reductions are very 
nice because there is never a full pruning. This heavy pruning by the policy 
network OTOH seems to be an issue for me. My program has big tactical holes.

--
GCP

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Erik van der Werf
On Mon, May 22, 2017 at 10:08 AM, Gian-Carlo Pascutto  wrote:
>
> ... This heavy pruning
> by the policy network OTOH seems to be an issue for me. My program has
> big tactical holes.


Do you do any hard pruning? My engines (Steenvreter,Magog) always had a
move predictor (a.k.a. policy net), but I never saw the need to do hard
pruning. Steenvreter uses the predictions to set priors, and it is very
selective, but with infinite simulations eventually all potentially
relevant moves will get sampled.

Erik

Re: [Computer-go] AlphaGo watch parties planned across U.S.

2017-05-22 Thread Stephen Martindale
Interesting... but I don't see the links for "watch the games live", only
the schedule. I assume that the page will be updated when the games
actually start... or are about to.

Yours Faithfully,
*Stephen Martindale*

+49 162 577 2166
stephen.c.martind...@gmail.com

On 22 May 2017 at 09:07, Hideki Kato  wrote:

> Probably here.
> http://events.google.com/alphago2017/
> "Below, you can find out the game schedule, watch the games live
> or replay the highlights."
>
> Hideki
>
> Joshua Shriver: 

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Gian-Carlo Pascutto
On 20/05/2017 22:26, Brian Sheppard via Computer-go wrote:
> Could use late-move reductions to eliminate the hard pruning. Given
> the accuracy rate of the policy network, I would guess that even move
> 2 should be reduced.
> 

The question I always ask is: what's the real difference between MCTS
with a small UCT constant and an alpha-beta search with heavy Late Move
Reductions? Are the explored trees really so different?

In any case, in my experiments Monte Carlo still gives a strong benefit,
even with a not-so-strong Monte Carlo part. IIRC it was the case for
AlphaGo too - and they used more training data for the value network than
is publicly available - and Zen reported the same: Monte Carlo is important.

The main problem is the "only top x moves" part. Late Move Reductions
are very nice because there is never a full prune. This heavy pruning
by the policy network, OTOH, seems to be an issue for me. My program has
big tactical holes.

-- 
GCP
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] AlphaGo watch parties planned across U.S.

2017-05-22 Thread Hideki Kato
Probably here.  
http://events.google.com/alphago2017/
"Below, you can find out the game schedule, watch the games live 
or replay the highlights."

Hideki

Joshua Shriver: 

Re: [Computer-go] AlphaGo watch parties planned across U.S.

2017-05-22 Thread Joshua Shriver
Where can it be viewed online?

-Josh

On Sun, May 21, 2017 at 9:09 PM, Ray Tayek  wrote:
> http://www.usgo.org/news/2017/05/alphago-watch-parties-planned-across-u-s/
>
> looks like 10:30 pm monday.
>
> thanks
>
> --
> Honesty is a very expensive gift. So, don't expect it from cheap people -
> Warren Buffett
> http://tayek.com/
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go