Re: [Computer-go] MuZero - new paper from DeepMind

2019-11-25 Thread Brian Sheppard via Computer-go
I read through that paper, but I admit that I didn't really get where the extra power comes from. -Original Message- From: valkyria To: Computer-go Sent: Mon, Nov 25, 2019 6:41 am Subject: [Computer-go] MuZero - new paper from DeepMind Hi, if anyone still gets email from this list:

Re: [Computer-go] Indexing and Searching Go Positions -- Literature Wanted

2019-09-17 Thread Brian Sheppard via Computer-go
I remember a scheme (from Dave Dyer, IIRC) that indexed positions based on the points on which the 20th, 40th, 60th,... moves were made. IIRC it was nearly a unique key for pro positions. Best,Brian -Original Message- From: Erik van der Werf To: computer-go Sent: Tue, Sep 17, 2019

Re: [Computer-go] Accelerating Self-Play Learning in Go

2019-03-08 Thread Brian Sheppard via Computer-go
>> contrary to intuition built up from earlier-generation MCTS programs in Go, >> putting significant weight on score maximization rather than only >> win/loss seems to help. This narrative glosses over important nuances. Collectively we are trying to find the golden mean of cost efficiency...

Re: [Computer-go] PUCT formula

2018-03-09 Thread Brian Sheppard via Computer-go
Thanks for the explanation. I agree that there is no actual consistency in exploration terms across historical papers. I confirmed that the PUCT formulas across the AG, AGZ, and AZ papers are all consistent. That is unlikely to be an error. So now I am wondering whether the faster decay is

Re: [Computer-go] PUCT formula

2018-03-09 Thread Brian Sheppard via Computer-go
18 4:48 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] PUCT formula On 08-03-18 18:47, Brian Sheppard via Computer-go wrote: > I recall that someone investigated this question, but I don’t recall > the result. What is the formula that AGZ actually uses? The one mentioned in their

[Computer-go] PUCT formula

2018-03-08 Thread Brian Sheppard via Computer-go
In the AGZ paper, there is a formula for what they call “a variant of the PUCT algorithm”, and they cite a paper from Christopher Rosin: http://gauss.ececs.uc.edu/Workshops/isaim2010/papers/rosin.pdf But that paper has a formula that he calls the PUCB formula, which incorporates the priors
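The selection rule under discussion can be written down concretely. A minimal sketch of the AGZ-style "variant of PUCT", assuming each child carries a prior P, visit count N, and total value W (the dict layout and c_puct value are illustrative choices, not the paper's):

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U under the AGZ-style PUCT rule.

    children: list of dicts with prior "P", visit count "N", total value "W".
    c_puct is a tunable exploration constant (1.5 is an arbitrary choice).
    """
    total_n = sum(ch["N"] for ch in children)

    def score(ch):
        q = ch["W"] / ch["N"] if ch["N"] > 0 else 0.0
        u = c_puct * ch["P"] * math.sqrt(total_n) / (1 + ch["N"])
        return q + u

    return max(children, key=score)
```

Note that the exploration term grows with the parent's total visits but is divided by 1 + N(child), so unvisited high-prior moves are tried early and Q dominates asymptotically.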

Re: [Computer-go] On proper naming

2018-03-08 Thread Brian Sheppard via Computer-go
The technique originated with backgammon players in the late 1970's, who would roll out positions manually. Ron Tiekert (Scrabble champion) also applied the technique to Scrabble, and I took that idea for Maven. It seemed like people were using the terms interchangeably. -Original

Re: [Computer-go] 9x9 is last frontier?

2018-03-07 Thread Brian Sheppard via Computer-go
s attempt at Shogi using AlphaZero's method will turn out. regards, Daniel On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go <computer-go@computer-go.org <mailto:computer-go@computer-go.org> > wrote: Training on Stockfish games is guarant

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Brian Sheppard via Computer-go
Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go <computer-go@computer-go.org <mailto:computer-go@computer-go.org> > wrote: Training on Stockfish games is guaranteed to produce a blunder-fest, because there are no blunders in the training set and therefore the policy networ

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Brian Sheppard via Computer-go
Training on Stockfish games is guaranteed to produce a blunder-fest, because there are no blunders in the training set and therefore the policy network never learns how to refute blunders. This is not a flaw in MCTS, but rather in the policy network. MCTS will eventually search every move

Re: [Computer-go] Project Leela Zero

2017-12-29 Thread Brian Sheppard via Computer-go
Seems like extraordinarily fast progress. Great to hear that. -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of "Ingo Althöfer" Sent: Friday, December 29, 2017 12:30 PM To: computer-go@computer-go.org Subject: [Computer-go] Project Leela Zero

Re: [Computer-go] AGZ Policy Head

2017-12-29 Thread Brian Sheppard via Computer-go
I agree that having special knowledge for "pass" is not a big compromise, but it would not meet the "zero knowledge" goal, no? -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Rémi Coulom Sent: Friday, December 29, 2017 7:50 AM To:

Re: [Computer-go] mcts and tactics

2017-12-19 Thread Brian Sheppard via Computer-go
>I wouldn't find it so surprising if eventually the 20 or 40 block networks >develop a set of convolutional channels that traces possible ladders >diagonally across the board. Learning the deep tactics is more-or-less guaranteed because of the interaction between search and evaluation

Re: [Computer-go] AlphaZero & Co. still far from perfect play ?

2017-12-08 Thread Brian Sheppard via Computer-go
Agreed. You can push this farther. If we define an “error” as a move that flips the W/L state of a Go game, then only the side that is currently winning can make an error. Let’s suppose that 6.5 komi is winning for Black. Then Black can make an error, and after he does then White can make

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
AZ scalability looks good in that diagram, and it is certainly a good start, but it only goes out through 10 sec/move. Also, if the hardware is 7x better for AZ than SF, then should we elongate the curve for AZ by 7x? Or compress the curve for SF by 7x? Or some combination? Or take the data at

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
@computer-go.org] On Behalf Of Gian-Carlo Pascutto Sent: Thursday, December 7, 2017 8:17 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm On 7/12/2017 13:20, Brian Sheppard via Computer

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
, December 7, 2017 4:13 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm On 06-12-17 22:29, Brian Sheppard via Computer-go wrote: > The chess result is 64-36: a 100 rating point edge! I th

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-06 Thread Brian Sheppard via Computer-go
Requiring a margin > 55% is a defense against a random result. A 55% score in a 400-game match is 2 sigma. But I like the AZ policy better, because it does not require arbitrary parameters. It also improves more fluidly by always drawing training examples from the current probability
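The "2 sigma" figure follows from the binomial standard deviation of a match score. A quick check of the arithmetic (400 games and p = 0.5 as in the email):

```python
import math

def match_sigma(games=400, p=0.5):
    """Standard deviation of the observed win fraction for a true
    win probability p over `games` independent games."""
    return math.sqrt(p * (1 - p) / games)

# For 400 games at p = 0.5: sigma = sqrt(0.25 / 400) = 0.025,
# so a 55% score sits (0.55 - 0.50) / 0.025 = 2 standard deviations
# above an even result.
sigma = match_sigma()
```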

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-06 Thread Brian Sheppard via Computer-go
The chess result is 64-36: a 100 rating point edge! I think the Stockfish open source project improved Stockfish by ~20 rating points in the last year. Given the number of people/computers involved, Stockfish’s annual effort level seems comparable to the AZ effort. Stockfish is really,

Re: [Computer-go] Significance of resignation in AGZ

2017-12-03 Thread Brian Sheppard via Computer-go
than zero, or is it imitating the best known algorithm inconvenient for your purposes? Best, -Chaz On Sat, Dec 2, 2017 at 7:31 PM, Brian Sheppard via Computer-go <computer-go@computer-go.org <mailto:computer-go@computer-go.org> > wrote: I implemented the ad hoc rule of not training on pos

Re: [Computer-go] Significance of resignation in AGZ

2017-12-02 Thread Brian Sheppard via Computer-go
be not to resign too early (even before not passing) On 02/12/2017 at 18:17, Brian Sheppard via Computer-go wrote: I have some hard data now. My network’s initial training reached the same performance in half the iterations. That is, the steepness of skill gain in the first day of training

Re: [Computer-go] Significance of resignation in AGZ

2017-12-02 Thread Brian Sheppard via Computer-go
, YMMV, etc. From: Brian Sheppard [mailto:sheppar...@aol.com] Sent: Friday, December 1, 2017 5:39 PM To: 'computer-go' <computer-go@computer-go.org> Subject: RE: [Computer-go] Significance of resignation in AGZ I didn’t measure precisely because as soon as I saw the training artif

Re: [Computer-go] Significance of resignation in AGZ

2017-12-01 Thread Brian Sheppard via Computer-go
20%) of its capacity to improve accuracy on positions that will not contribute to its ultimate strength. This applies to both ordering and evaluation aspects. From: Andy [mailto:andy.olsen...@gmail.com] Sent: Friday, December 1, 2017 4:55 PM To: Brian Sheppard <sheppar...@aol.com>;

[Computer-go] Significance of resignation in AGZ

2017-12-01 Thread Brian Sheppard via Computer-go
I have concluded that AGZ's policy of resigning "lost" games early is somewhat significant. Not as significant as using residual networks, for sure, but you wouldn't want to go without these advantages. The benefit cited in the paper is speed. Certainly a factor. I see two other advantages.

Re: [Computer-go] Is MCTS needed?

2017-11-16 Thread Brian Sheppard via Computer-go
State of the art in computer chess is alpha-beta search, but note that the search is very selective because of "late move reductions." A late move reduction is to reduce depth for moves after the first move generated in a node. For example, a simple implementation would be "search the first
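The reduction scheme described here can be sketched over a toy game tree. This is a minimal illustration, not any engine's actual rule: a node is either a leaf score (from the side to move) or a list of children assumed ordered best-first; late moves are searched one ply shallower, with a full-depth re-search on a fail-high.

```python
def lmr_search(node, depth, alpha=-10**9, beta=10**9):
    """Negamax with a late-move-reduction sketch over a toy tree."""
    if isinstance(node, (int, float)):
        return node          # leaf: static score for the side to move
    if depth <= 0:
        return 0             # stand-in evaluation for out-of-depth nodes
    best = alpha
    for i, child in enumerate(node):
        reduction = 1 if i > 0 else 0      # reduce every move after the first
        score = -lmr_search(child, depth - 1 - reduction, -beta, -best)
        if reduction and score > best:
            # reduced search beat the best line: verify at full depth
            score = -lmr_search(child, depth - 1, -beta, -best)
        if score > best:
            best = score
            if best >= beta:
                break        # beta cutoff
    return best
```

Real engines condition the reduction on depth, move count, and history statistics; the re-search step is what keeps the pruning "soft" rather than hard.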

Re: [Computer-go] Zero is weaker than Master!?

2017-10-26 Thread Brian Sheppard via Computer-go
I would add that "wild guesses based on not enough info" is an indispensable skill. -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Hideki Kato Sent: Thursday, October 26, 2017 10:17 AM To: computer-go@computer-go.org Subject: Re:

Re: [Computer-go] AlphaGo Zero SGF - Free Use or Copyright?

2017-10-26 Thread Brian Sheppard via Computer-go
On Behalf Of Robert Jasiek Sent: Thursday, October 26, 2017 10:17 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] AlphaGo Zero SGF - Free Use or Copyright? On 26.10.2017 13:52, Brian Sheppard via Computer-go wrote: > MCTS is the glue that binds incompatible rules. This is, howeve

Re: [Computer-go] AlphaGo Zero SGF - Free Use or Copyright?

2017-10-26 Thread Brian Sheppard via Computer-go
Robert is right, but Robert seems to think this hasn't been done. Actually every prominent non-neural MCTS program since Mogo has been based on the exact design that Robert describes. The best of them achieve somewhat greater strength than Robert expects. MCTS is the glue that binds

Re: [Computer-go] Source code (Was: Reducing network size? (Was: AlphaGo Zero))

2017-10-25 Thread Brian Sheppard via Computer-go
I think it uses the champion network. That is, the training periodically generates a candidate, and there is a playoff against the current champion. If the candidate wins by more than 55% then a new champion is declared. Keeping a champion is an important mechanism, I believe. That creates
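The gating mechanism described here is easy to state as a loop. A sketch, with `play_game` an assumed engine hook and the 400-game match length an illustrative choice (the 55% threshold is from the email):

```python
def gate(champion, candidate, play_game, games=400, threshold=0.55):
    """Keep a champion network: the candidate replaces it only if it
    scores strictly above `threshold` in a head-to-head match.

    play_game(a, b) -> 1 if a wins, else 0 (assumed stub).
    """
    wins = sum(play_game(candidate, champion) for _ in range(games))
    return candidate if wins / games > threshold else champion
```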

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Brian Sheppard via Computer-go
So I am reading that residual networks are simply better than normal convolutional networks. There is a detailed write-up here: https://blog.waya.ai/deep-residual-learning-9610bb62c355 Summary: the residual network has a fixed connection that adds (with no scaling) the output of the previous
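The "fixed connection that adds (with no scaling)" is the skip path of a residual block. A dense toy version in plain Python (the real AGZ blocks are 3x3 convolutions with batch normalization, omitted here for brevity):

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, w):
    # w is a square weight matrix given as a list of rows
    return [sum(wij * xj for wij, xj in zip(row, v)) for row in w]

def residual_block(v, w1, w2):
    """y = relu(v + F(v)): the skip path adds the block's input back,
    unscaled, so with zero weights the block reduces to the identity.
    That easy identity mapping is a key reason residual nets train
    well at depth."""
    y = relu(linear(v, w1))          # first layer + ReLU
    y = linear(y, w2)                # second layer
    return relu([a + b for a, b in zip(y, v)])   # add skip, then ReLU
```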

Re: [Computer-go] AlphaGo Zero

2017-10-18 Thread Brian Sheppard via Computer-go
-Carlo Pascutto Sent: Wednesday, October 18, 2017 5:40 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] AlphaGo Zero On 18/10/2017 22:00, Brian Sheppard via Computer-go wrote: > This paper is required reading. When I read this team’s papers, I > think to myself “Wow, this is br

Re: [Computer-go] AlphaGo Zero

2017-10-18 Thread Brian Sheppard via Computer-go
Some thoughts toward the idea of general game-playing... One aspect of Go is ideally suited for visual NN: strong locality of reference. That is, stones affect stones that are nearby. I wonder whether the late emergence of ladder understanding within AlphaGo Zero is an artifact of the board

Re: [Computer-go] AlphaGo Zero

2017-10-18 Thread Brian Sheppard via Computer-go
October 18, 2017 4:38 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] AlphaGo Zero On 18/10/2017 22:00, Brian Sheppard via Computer-go wrote: > A stunning result. The NN uses a standard vision architecture (no Go > adaptation beyond what is necessary to represent the game state).

Re: [Computer-go] AlphaGo Zero

2017-10-18 Thread Brian Sheppard via Computer-go
This paper is required reading. When I read this team’s papers, I think to myself “Wow, this is brilliant! And I think I see the next step.” When I read their next paper, they show me the next *three* steps. I can’t say enough good things about the quality of the work. A stunning result.

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
PM To: Brian Sheppard <sheppar...@aol.com> Cc: computer-go <computer-go@computer-go.org> Subject: Re: [Computer-go] Alphago and solving Go This is semantics. Yes, in the limit of infinite time, it is brute-force. Meanwhile, in the real world, AlphaGo chooses to balance its finite

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
[mailto:lightvec...@gmail.com] Sent: Sunday, August 6, 2017 2:54 PM To: Brian Sheppard <sheppar...@aol.com>; computer-go@computer-go.org Cc: Steven Clark <steven.p.cl...@gmail.com> Subject: Re: [Computer-go] Alphago and solving Go Saying in an unqualified way that AlphaGo is brute fo

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
There is a definition of “brute force” on Wikipedia. Seems to me that MCTS fits. Deep Blue fits. From: Álvaro Begué [mailto:alvaro.be...@gmail.com] Sent: Sunday, August 6, 2017 2:56 PM To: Brian Sheppard <sheppar...@aol.com>; computer-go <computer-go@computer-go.org> Subject: R

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
as possible in each node. It is in this respect that AlphaGo truly excels. (AlphaGo also excels at whole board evaluation, but that is a separate topic.) From: Steven Clark [mailto:steven.p.cl...@gmail.com] Sent: Sunday, August 6, 2017 1:14 PM To: Brian Sheppard <sheppar...@aol.com>; co

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
Yes, AlphaGo is brute force. No, it is impossible to solve Go. Perfect play looks a lot like AlphaGo in that you would not be able to tell the difference. But I think that AlphaGo still has 0% win rate against perfect play. My own best guess is that top humans make about 12 errors per game.

Re: [Computer-go] Possible idea - decay old simulations?

2017-07-24 Thread Brian Sheppard via Computer-go
>I haven't tried it, but (with the computer chess hat on) these kind of >proposals behave pretty badly when you get into situations where your >evaluation is off and there are horizon effects. In computer Go, this issue focuses on cases where the initial move ordering is bad. It isn't so much

Re: [Computer-go] Possible idea - decay old simulations?

2017-07-23 Thread Brian Sheppard via Computer-go
Yes. This is a long-known phenomenon. I was able to get improvements in Pebbles based on the idea of forgetting unsuccessful results. It has to be done somewhat carefully, because results propagate up the tree. But you can definitely make it work. I recall a paper published on this

Re: [Computer-go] Value network that doesn't want to learn.

2017-06-23 Thread Brian Sheppard via Computer-go
>... my value network was trained to tell me the game is balanced at the >beginning... :-) The best training policy is to select positions that correct errors. I used the policies below to train a backgammon NN. Together, they reduced the expected loss of the network by 50% (cut the error

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Brian Sheppard via Computer-go
er-go.org] On Behalf Of Gian-Carlo Pascutto Sent: Monday, May 22, 2017 4:08 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] mini-max with Policy and Value network On 20/05/2017 22:26, Brian Sheppard via Computer-go wrote: > Could use late-move reductions to eliminate the hard pruning. Giv

Re: [Computer-go] mini-max with Policy and Value network

2017-05-20 Thread Brian Sheppard via Computer-go
Could use late-move reductions to eliminate the hard pruning. Given the accuracy rate of the policy network, I would guess that even move 2 should be reduced. -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Hiroshi Yamashita Sent:

Re: [Computer-go] Patterns and bad shape

2017-04-18 Thread Brian Sheppard via Computer-go
...@gmail.com] Sent: Tuesday, April 18, 2017 9:06 AM To: computer-go@computer-go.org; Brian Sheppard <sheppar...@aol.com> Subject: Re: [Computer-go] Patterns and bad shape Now, I love this idea. A super fast cheap pattern matcher can act as input into a neural network input layer in sort of

Re: [Computer-go] Patterns and bad shape

2017-04-18 Thread Brian Sheppard via Computer-go
Adding patterns is very cheap: encode the patterns as an if/else tree, and it is O(log n) to match. Pattern matching as such did not show up as a significant component of Pebbles. But that is mostly because all of the machinery that makes pattern-matching cheap (incremental updating of 3x3
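The cheap-matching claim can be illustrated by the usual trick of packing a 3x3 neighborhood into a small integer and testing membership in a precompiled set. This sketch uses a sorted list with binary search as a stand-in for the hard-coded if/else tree the email describes (the encoding is illustrative, not Pebbles' actual one):

```python
from bisect import bisect_left

EMPTY, BLACK, WHITE = 0, 1, 2

def encode3x3(neighborhood):
    """Pack the 8 points around a candidate move into one base-3 code."""
    code = 0
    for color in neighborhood:
        code = code * 3 + color
    return code

def match(code, sorted_pattern_codes):
    """O(log n) membership test over precompiled pattern codes."""
    i = bisect_left(sorted_pattern_codes, code)
    return i < len(sorted_pattern_codes) and sorted_pattern_codes[i] == code
```

In a real engine the codes are updated incrementally as stones are placed, which is the machinery the email says dominates the cost.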

Re: [Computer-go] dealing with multiple local optima

2017-02-24 Thread Brian Sheppard via Computer-go
networks have always been surprisingly successful at Go. Like Brugmann’s paper about Monte Carlo, which was underestimated for a long time. Sigh. Best, Brian From: Minjae Kim [mailto:xive...@gmail.com] Sent: Friday, February 24, 2017 11:56 AM To: Brian Sheppard <sheppar...@aol.

Re: [Computer-go] dealing with multiple local optima

2017-02-24 Thread Brian Sheppard via Computer-go
Neural networks always have a lot of local optima. Simply because they have a high degree of internal symmetry. That is, you can “permute” sets of coefficients and get the same function. Don’t think of starting with expert training as a way to avoid local optima. It is a way to start

Re: [Computer-go] Playout policy optimization

2017-02-12 Thread Brian Sheppard via Computer-go
If your database is composed of self-play games, then the likelihood maximization policy should gain strength rapidly, and there should be a way to have asymptotic optimality. (That is, the patterns alone will play a perfect game in the limit.) Specifically: play self-play games using an

Re: [Computer-go] AlphaGo rollout nakade patterns?

2017-01-31 Thread Brian Sheppard via Computer-go
understand how others improve their rollouts. I hope that i will be able to improve the rollouts at some point. Roel On 31 January 2017 at 17:21, Brian Sheppard via Computer-go <computer-go@computer-go.org> wrote: If a "diamond" pattern is centered on a 5x5 square, then you

Re: [Computer-go] AlphaGo rollout nakade patterns?

2017-01-31 Thread Brian Sheppard via Computer-go
tions are you referring to? m.v.g. Roel On 24 January 2017 at 12:57, Brian Sheppard via Computer-go <computer-go@computer-go.org> wrote: There are two issues: one is the shape and the other is the policy that the search should follow. Of course the vital point is a killing mo

Re: [Computer-go] AlphaGo rollout nakade patterns?

2017-01-24 Thread Brian Sheppard via Computer-go
computer-go.org Subject: Re: [Computer-go] AlphaGo rollout nakade patterns? On 23-01-17 20:10, Brian Sheppard via Computer-go wrote: > only captures of up to 9 stones can be nakade. I don't really understand this. http://senseis.xmp.net/?StraightThree Both constructing this shape and playing

Re: [Computer-go] AlphaGo rollout nakade patterns?

2017-01-23 Thread Brian Sheppard via Computer-go
A capturing move has a potential nakade if the string that was removed is among a limited set of possibilities. Probably AlphaGo has a 13-point bounding region (e.g., the 13-point star) that it uses as a positional index, and therefore an 8192-sized pattern set will identify all potential

Re: [Computer-go] Training the value network (a possibly more efficient approach)

2017-01-10 Thread Brian Sheppard
I was writing code along those lines when AlphaGo debuted. When it became clear that AlphaGo had succeeded, then I ceased work. So I don’t know whether this strategy will succeed, but the theoretical merits were good enough to encourage me. Best of luck, Brian From: Computer-go

Re: [Computer-go] Some experiences with CNN trained on moves by the winning player

2016-12-20 Thread Brian Sheppard
The downside to training the moves of the winner is that you "throw away" half of the data. But that isn't so bad, because you can always get as much data as you need. The upside is that losing positions do not necessarily have a training signal. That is: there is no good move in a losing

Re: [Computer-go] Some experiences with CNN trained on moves by the winning player

2016-12-11 Thread Brian Sheppard
IMO, training using only the moves of winners is obviously the practical choice. Worst case: you "waste" half of your data. But that is actually not a downside provided that you have lots of data, and as your program strengthens you will avoid potential data-quality problems. Asymptotically,

Re: [Computer-go] Converging to 57%

2016-08-23 Thread Brian Sheppard
The learning rate seems much too high. My experience (which is from backgammon rather than Go, among other caveats) is that you need tiny learning rates. Tiny, as in 1/TrainingSetSize. Neural networks are dark magic. Be prepared to spend many weeks just trying to figure things out. You can

Re: [Computer-go] Crazystone on Steam

2016-05-27 Thread Brian Sheppard
Are you the Johannes Laire who works at Cimpress in WTR? -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Johannes Laire Sent: Friday, May 27, 2016 4:45 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] Crazystone on Steam The

Re: [Computer-go] new challenge for Go programmers

2016-03-31 Thread Brian Sheppard
Ingo Althöfer" Sent: Thursday, March 31, 2016 4:31 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] new challenge for Go programmers Hello all, "Brian Sheppard" <sheppar...@aol.com> wrote: > ... This is out of line, IMO. Djhbrown asked a sensible ques

Re: [Computer-go] new challenge for Go programmers

2016-03-30 Thread Brian Sheppard
This is out of line, IMO. Djhbrown asked a sensible question that has valuable intentions. I would like to see responsible, thoughtful, and constructive replies. From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of uurtamo . Sent: Wednesday, March 30, 2016 7:43 PM To:

Re: [Computer-go] new challenge for Go programmers

2016-03-30 Thread Brian Sheppard
Trouble is that it is very difficult to put certain concepts into mathematics. For instance: “well, I tried to find parameters that did a better job of minimizing that error function, but eventually I lost patience.” :-) Neural network parameters are not directly humanly understandable. They

Re: [Computer-go] Nice graph

2016-03-25 Thread Brian Sheppard
Hmm, seems to imply a 1000-Elo edge over human 9p. But such a player would literally never lose a game to a human. I take this as an example of the difficulty of extrapolating based on games against computers. (and the slide seems to have a disclaimer to this effect, if I am reading the text
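The "would literally never lose" reading follows from the logistic Elo model. A quick check (the model itself, not any claim about the slide):

```python
def elo_win_prob(diff):
    """Expected score for a `diff`-point rating advantage under the
    standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# A 1000-point edge gives an expected score of about 0.9968,
# i.e. roughly one loss in every ~317 games.
p = elo_win_prob(1000)
```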

Re: [Computer-go] AlphaGo & DCNN: Handling long-range dependency

2016-03-15 Thread Brian Sheppard
for championship caliber. -Original Message- From: Brian Sheppard [mailto:sheppar...@aol.com] Sent: Tuesday, March 15, 2016 6:20 PM To: 'computer-go@computer-go.org' <computer-go@computer-go.org> Subject: RE: [Computer-go] AlphaGo & DCNN: Handling long-range dependency >So

Re: [Computer-go] AlphaGo & DCNN: Handling long-range dependency

2016-03-15 Thread Brian Sheppard
>So a small error in the opening or middle game can literally be worth anything >by the time the game ends. These are my estimates: human pros >= 24 points lost, and >= 5 game-losing errors against other human pros. I relayed my experience of a comparable experiment with chess, and how those

Re: [Computer-go] AlphaGo & DCNN: Handling long-range dependency

2016-03-14 Thread Brian Sheppard
Maybe 10 years ago I extrapolated from the results of title matches to arrive at an estimate of perfect play. My memory is hazy about the conclusions (sorry), so I would love if someone is curious enough to build models and draw conclusions and share them. The basis of the estimate is that

Re: [Computer-go] Game 4: a rare insight

2016-03-13 Thread Brian Sheppard
I have the impression that the value network is used to initialize the score of a node to, say, 70% out of N trials. Then the MCTS is trial N+1, N+2, etc. Still asymptotically optimal, but if the value network is accurate then you have a big acceleration in accuracy because the scores start
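The initialization scheme being described amounts to seeding a node's statistics with virtual visits. A sketch under that assumption (the prior weight of 10 is an illustrative choice, not AlphaGo's):

```python
class Node:
    def __init__(self, value_net_winrate, prior_weight=10):
        # Seed the node as if it had already been visited
        # `prior_weight` times with the value net's estimate.
        self.visits = prior_weight
        self.wins = value_net_winrate * prior_weight

    def update(self, result):
        """Fold in one MCTS trial: result is 1.0 for a win, 0.0 for a loss."""
        self.visits += 1
        self.wins += result

    def winrate(self):
        return self.wins / self.visits
```

Because real trials accumulate without bound, the prior's influence washes out, which is why the scheme stays asymptotically optimal while accelerating early accuracy.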

Re: [Computer-go] Game 4: a rare insight

2016-03-13 Thread Brian Sheppard
I would not place too much confidence in the observers. Even though they are pro players, they don't have the same degree of concentration as the game's participants, and they have an obligation to speak about the game on a regular basis which further deteriorates analytic skills. Figuring out

Re: [Computer-go] Congratulations to AlphaGo

2016-03-12 Thread Brian Sheppard
Play on KGS. Pros can be anonymous, and test themselves and AlphaGo at the same time. :-) From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Jim O'Flaherty Sent: Saturday, March 12, 2016 4:56 PM To: computer-go@computer-go.org Subject: Re: [Computer-go]

Re: [Computer-go] Congratulations to AlphaGo

2016-03-12 Thread Brian Sheppard
Playing on KGS will give a good sense. Players have sustained ranks between 9 and 10 dan there, but never over 10 dan. If AlphaGo can sustain over 10 dan, then there is clear separation. -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of

Re: [Computer-go] AlphaGo & DCNN: Handling long-range dependency

2016-03-11 Thread Brian Sheppard
Actually chess software is much, much better. I recall that today's software running on 1998 hardware beats 1998 software running on today's hardware. It was very soon after 1998 that ordinary PCs could play on a par with world champions. -Original Message- From: Computer-go

Re: [Computer-go] Finding Alphago's Weaknesses

2016-03-10 Thread Brian Sheppard
Amen to Don Dailey. He would be so proud. From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Jim O'Flaherty Sent: Thursday, March 10, 2016 6:49 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] Finding Alphago's Weaknesses I think we are going to see a

Re: [Computer-go] Neural Net move prediction

2016-02-04 Thread Brian Sheppard
The method is not likely to work, since the goal of NN training is to reduce the residual error to a random function of the NN inputs. If the NN succeeds at this, then there will be no signal to train against. If the NN fails, then it could be because the NN is not large enough, or because there

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search

2016-02-01 Thread Brian Sheppard
You play until neither player wishes to make a move. The players are willing to move on any point that is not self-atari, and they are willing to make self-atari plays if capture would result in a Nakade (http://senseis.xmp.net/?Nakade) This correctly plays seki. From: Computer-go
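The termination rule stated here can be expressed as a pair of predicates. A sketch in which `is_self_atari` and `capture_yields_nakade` are assumed helpers supplied by the host Go engine:

```python
def wants_to_move(legal_moves, is_self_atari, capture_yields_nakade):
    """A playout player keeps moving while some move is either not a
    self-atari, or is a self-atari capture that creates a nakade shape
    (the rule the email gives for handling seki correctly)."""
    return any(
        not is_self_atari(m) or capture_yields_nakade(m)
        for m in legal_moves
    )

def playout_finished(moves_black, moves_white, is_self_atari, nakade):
    # The playout ends when neither player wishes to make a move.
    return not (wants_to_move(moves_black, is_self_atari, nakade)
                or wants_to_move(moves_white, is_self_atari, nakade))
```

In a seki, every remaining move for both sides is a self-atari whose capture yields nothing, so neither player moves and the shared liberties stand, as the email claims.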

Re: [Computer-go] Game Over

2016-01-28 Thread Brian Sheppard
seem to work quite well. There is a paper by him on this topic in the proceedings of the conference Advances in Computer Games 2015 (in Leiden , NL), published by Springer Lecture Notes in Computer Science (LNCS). A very early application of truncated rollouts was applied by Brian Sheppa

Re: [Computer-go] alternative for cgos

2015-01-13 Thread Brian Sheppard
Alas, the limitations of a hosted environment would make participating in Baduk.io a non-starter for me. It is much better to build a selection of standard bots and run tests on my own hardware. Which is what I have done in the absence of CGOS. IMO, running using one’s own

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-23 Thread Brian Sheppard
A 3-layer network (input, hidden, output) is sufficient to be a universal function approximator, so from a theoretical perspective only 3 layers are necessary. But the gap between theoretical and practical is quite large. The CNN architecture builds in translation invariance and sensitivity

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-19 Thread Brian Sheppard
This is Aya's move predictor (W) vs GNU Go (B). http://eidogo.com/#3BNw8ez0R I think the previous-move effect is too strong. This is a good example of why a good playout engine will not necessarily play well. The purpose of the playout policy is to *balance* errors. Following your opponent's last

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-16 Thread Brian Sheppard
? If the former, it had better be reasonably fast. The latter approach can be far slower, but requires the predictions to be of much higher quality. And the biggest question, how can you make these two approaches interact efficiently? René On Mon, Dec 15, 2014 at 8:00 PM, Brian Sheppard

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-16 Thread Brian Sheppard
will enjoy reading it. Regards, Aja On Tue, Dec 16, 2014 at 2:03 PM, Brian Sheppard sheppar...@aol.com wrote: My impression is that each feature gets a single weight in Crazy-stone. The team-of-features aspect arises because a single point can match several patterns, so you need a model

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-15 Thread Brian Sheppard
So I read this kind of study with some skepticism. My guess is that the large-scale pattern systems in use by leading programs are already pretty good for their purpose (i.e., progressive bias). Rampant personal and unverified speculation follows...

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-15 Thread Brian Sheppard
Is it really such a burden? Well, I have to place my bets on some things and not on others. It seems to me that the costs of a NN must be higher than a system based on decision trees. The convolution NN has a very large parameter space if my reading of the paper is correct. Specifically,

[computer-go] Paper about Mogo's Opening Strategy

2010-01-15 Thread Brian Sheppard
I recommend the paper http://hal.inria.fr/docs/00/36/97/83/PDF/ouvertures9x9.pdf by the Mogo team, which describes how to use a grid to compute Mogo's opening book using coevolution. Brian ___ computer-go mailing list computer-go@computer-go.org

[computer-go] Optimizing combinations of flags

2009-11-24 Thread Brian Sheppard
What do you do when you add a new parameter? Do you retain your existing 'history', considering each game to have been played with the value of the new parameter set to zero? Yes, exactly. If you have 50 parameters already, doesn't adding a new parameter create a rather large number of new

[computer-go] Optimizing combinations of flags

2009-11-23 Thread Brian Sheppard
From what I understand, for each parameter you take with some high probability the best so far, and with some lower probability the least tried one. This requires (manually) enumerating all parameters on some integer scale, if I got it correctly. Yes, for each parameter you make a range of values
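The per-parameter selection rule described above might be sketched as follows. This is a hypothetical epsilon-greedy variant; the 10% exploration rate and all names are assumptions, not Pebbles' actual implementation.

```python
import random

def choose_value(values, wins, trials, explore_prob=0.1, rng=random):
    """With high probability take the best value so far,
    otherwise the least-tried value."""
    tried = [v for v in values if trials[v] > 0]
    if not tried or rng.random() < explore_prob:
        return min(values, key=lambda v: trials[v])       # least-tried value
    return max(tried, key=lambda v: wins[v] / trials[v])  # best win rate so far

def choose_config(params, stats, rng=random):
    """Pick a value independently for every parameter (flag)."""
    return {p: choose_value(vals, stats[p]["wins"], stats[p]["trials"], rng=rng)
            for p, vals in params.items()}
```

After each game, trials (and, on a win, wins) would be incremented for every value that was active, so bad values are sampled less and less while still being tried infinitely often, which is what keeps the asymptotic regret at zero.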

[computer-go] Optimizing combinations of flags

2009-11-23 Thread Brian Sheppard
Your system seems very interesting, but it seems to me that you assume the parameters are independent. What happens if, for example, two parameters work well when only one of them is active, but badly when both are active at the same time? I think that I am assuming only that the objective

[computer-go] Optimizing combinations of flags

2009-11-20 Thread Brian Sheppard
On that topic, I have around 17 flag who enable or not features in my pure playouts bots, and I want to search the best combinations of them. I known this is almost a dream but does anyone know the best way to approximate this. Pebbles randomly chooses (using a zero asymptotic regret strategy)

[computer-go] A Tenure Track Position in Japan

2009-11-05 Thread Brian Sheppard
Man, I just don't live in the right location. First Paris, and now Japan. Can't a position open in, say, Boston? :-( ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/

[computer-go] MPI vs Thread-safe

2009-11-01 Thread Brian Sheppard
The PRNG (pseudo random number generator) can cause the super-linear speed-up, for example. Really? The serial simulation argument seems to apply. ___ computer-go mailing list computer-go@computer-go.org

[computer-go] MPI vs Thread-safe

2009-10-30 Thread Brian Sheppard
I personally just use root parallelization in Pachi. I think this answers my question; each core in Pachi independently explores a tree, and the master thread merges the data. This is even though you have shared memory on your machine. Have you read the Parallel Monte-Carlo Tree Search paper?

[computer-go] Reservations about Parallelization Strategy

2009-10-30 Thread Brian Sheppard
While re-reading the parallelization papers, I tried to formulate why I thought that they couldn't be right. The issue with Mango reporting a super-linear speed-up was an obvious red flag, but that doesn't mean that their conclusions were wrong. It just means that Mango's exploration policy needs

[computer-go] MPI vs Thread-safe

2009-10-30 Thread Brian Sheppard
Back of envelope calculation: MFG processes 5K nodes/sec/core * 4 cores per process = 20K nodes/sec/process. Four processes makes 80K nodes/sec. If you think for 30 seconds (pondering + move time) then you are at 2.4 million nodes. Figure about 25,000 nodes having 100 visits or more. UCT data is
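The back-of-envelope arithmetic above, written out as a check (figures come straight from the message; the 25,000 nodes with 100+ visits is a separate empirical estimate):

```python
# Throughput estimate for MFG from the message.
nodes_per_sec_per_core = 5_000
cores_per_process = 4
processes = 4
think_time_sec = 30  # pondering + move time

process_rate = nodes_per_sec_per_core * cores_per_process  # 20,000 nodes/sec
total_rate = process_rate * processes                      # 80,000 nodes/sec
nodes_per_move = total_rate * think_time_sec               # 2,400,000 nodes
```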

[computer-go] MPI vs Thread-safe

2009-10-29 Thread Brian Sheppard
I have a question for those who have parallelized programs. It seems like MPI is the obvious architecture when scaling a program to multiple machines. Let's assume that we implement a program that has that capability. Now, it is possible to use MPI for scaling *within* a compute node. For

[computer-go] NVidia Fermi and go?

2009-10-23 Thread Brian Sheppard
I just wondered if this new Fermi GPU solves the issues for go playouts, or don't really make any difference? My first impression of Fermi is very positive. Fermi contains a lot of features that make general purpose computing on a GPU much easier and better performing. However, it remains the

[computer-go] NVidia Fermi and go?

2009-10-23 Thread Brian Sheppard
In my own gpu experiment (light playouts), registers/memory were the bounding factors on simulation speed. I respect your experimental finding, but I note that you have carefully specified light playouts, probably because you suspect that there may be a significant difference if playouts are

[computer-go] NVidia Fermi and go?

2009-10-23 Thread Brian Sheppard
BTW, it occurs to me that we can approximate the efficiency of parallelization by taking execution counts from a profiler and post-processing them. I should do that before buying a new GPU. :-) I wonder what you mean by that. If you run your program on a sequential machine and count

[computer-go] Pattern matching

2009-10-13 Thread Brian Sheppard
Someone commented on it, so it must have come across OK. Yes, that is what I concluded. Is it possible that your e-mail provider or e-mail reader decided to not include the attachment? At least that's what 'skipped content' seems to indicate. I am reading in IE, by surfing to

[computer-go] Rating variability on CGOS

2009-10-08 Thread Brian Sheppard
About two weeks ago I took Pebbles offline for an extensive overhaul of its board representation. At that time Valkyria 3.3.4 had a 9x9 CGOS rating of roughly 2475. When I looked today, I saw Valkyria 3.3.4 rated at roughly 2334, so I wondered what was going on. I found a contributing factor:

[computer-go] Rating variability on CGOS

2009-10-08 Thread Brian Sheppard
One must be very careful about proclaiming wild transitivity issues. I'm not saying it's not an issue, there is some going on with every program on CGOS, but with less than 500 games between any two players you are going to get error margins of +/- 30-50 ELO or something like that. Actually we

[computer-go] Long superko example

2009-10-05 Thread Brian Sheppard
The sequence is as interesting as the superko, really, because it's the sort of strange sequence that would only occur to monte carlo programs. Yes, that is why I posted it. I'm very curious what black thought the win rate was before B9... I do not know. This position wasn't from a game. It

[computer-go] Long superko example

2009-10-04 Thread Brian Sheppard
   1 2 3 4 5 6 7 8 9
A  O O O X - X X O -
B  - O O X X O O - -
C  O - O X X - O O O
D  - O X X O O O X O
E  O O O O X X O X X
F  X O O X X X X X -
G  X X O O X O X - X
H  - X X - O O O X -
J  X X - O - X X - X

X: B9 (hole-of-three), O: A9 (atari), X: B8 (capture), O: A8 (atari), X: A9 (capture), O: A8
