Re: [Computer-go] MuZero - new paper from DeepMind

2019-11-25 Thread Brian Sheppard via Computer-go
I read through that paper, but I admit that I didn't really get where the extra power comes from. -Original Message- From: valkyria To: Computer-go Sent: Mon, Nov 25, 2019 6:41 am Subject: [Computer-go] MuZero - new paper from DeepMind Hi, if anyone still gets email from this list: D

Re: [Computer-go] Indexing and Searching Go Positions -- Literature Wanted

2019-09-17 Thread Brian Sheppard via Computer-go
I remember a scheme (from Dave Dyer, IIRC) that indexed positions based on the points on which the 20th, 40th, 60th,... moves were made. IIRC it was nearly a unique key for pro positions. Best, Brian -Original Message- From: Erik van der Werf To: computer-go Sent: Tue, Sep 17, 2019 5:5

Re: [Computer-go] Accelerating Self-Play Learning in Go

2019-03-08 Thread Brian Sheppard via Computer-go
>> contrary to intuition built up from earlier-generation MCTS programs in Go, >> putting significant weight on score maximization rather than only >> win/loss seems to help. This narrative glosses over important nuances. Collectively we are trying to find the golden mean of cost efficiency...

Re: [Computer-go] PUCT formula

2018-03-09 Thread Brian Sheppard via Computer-go
Thanks for the explanation. I agree that there is no actual consistency in exploration terms across historical papers. I confirmed that the PUCT formulas across the AG, AGZ, and AZ papers are all consistent. That is unlikely to be an error. So now I am wondering whether the faster decay is usef

Re: [Computer-go] PUCT formula

2018-03-09 Thread Brian Sheppard via Computer-go
Subject: Re: [Computer-go] PUCT formula On 08-03-18 18:47, Brian Sheppard via Computer-go wrote: > I recall that someone investigated this question, but I don’t recall > the result. What is the formula that AGZ actually uses? The one mentioned in their paper, I assume. I investigated both

[Computer-go] PUCT formula

2018-03-08 Thread Brian Sheppard via Computer-go
In the AGZ paper, there is a formula for what they call “a variant of the PUCT algorithm”, and they cite a paper from Christopher Rosin: http://gauss.ececs.uc.edu/Workshops/isaim2010/papers/rosin.pdf But that paper has a formula that he calls the PUCB formula, which incorporates the priors i

Re: [Computer-go] On proper naming

2018-03-08 Thread Brian Sheppard via Computer-go
The technique originated with backgammon players in the late 1970's, who would roll out positions manually. Ron Tiekert (Scrabble champion) also applied the technique to Scrabble, and I took that idea for Maven. It seemed like people were using the terms interchangeably. -Original Message--

Re: [Computer-go] 9x9 is last frontier?

2018-03-07 Thread Brian Sheppard via Computer-go
(as well as many others) with MCTS for tactically dominated games is bad, and there must be some breakthrough in that regard in AlphaZero for it to be able to compete with Stockfish on a tactical level. I am curious how Remi's attempt at Shogi using AlphaZero's method will turn out.

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Brian Sheppard via Computer-go
On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go wrote: Training on Stockfish games is guaranteed to produce a blunder-fest, because there are no blunders in the training set and therefore the policy network never learn

Re: [Computer-go] 9x9 is last frontier?

2018-03-06 Thread Brian Sheppard via Computer-go
Training on Stockfish games is guaranteed to produce a blunder-fest, because there are no blunders in the training set and therefore the policy network never learns how to refute blunders. This is not a flaw in MCTS, but rather in the policy network. MCTS will eventually search every move in

Re: [Computer-go] Project Leela Zero

2017-12-29 Thread Brian Sheppard via Computer-go
Seems like extraordinarily fast progress. Great to hear that. -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of "Ingo Althöfer" Sent: Friday, December 29, 2017 12:30 PM To: computer-go@computer-go.org Subject: [Computer-go] Project Leela Zero

Re: [Computer-go] AGZ Policy Head

2017-12-29 Thread Brian Sheppard via Computer-go
I agree that having special knowledge for "pass" is not a big compromise, but it would not meet the "zero knowledge" goal, no? -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Rémi Coulom Sent: Friday, December 29, 2017 7:50 AM To: computer-g

Re: [Computer-go] mcts and tactics

2017-12-19 Thread Brian Sheppard via Computer-go
>I wouldn't find it so surprising if eventually the 20 or 40 block networks >develop a set of convolutional channels that traces possible ladders >diagonally across the board. Learning the deep tactics is more-or-less guaranteed because of the interaction between search and evaluation throug

Re: [Computer-go] AlphaZero & Co. still far from perfect play ?

2017-12-08 Thread Brian Sheppard via Computer-go
Agreed. You can push this farther. If we define an “error” as a move that flips the W/L state of a Go game, then only the side that is currently winning can make an error. Let’s suppose that 6.5 komi is winning for Black. Then Black can make an error, and after he does then White can make an

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
AZ scalability looks good in that diagram, and it is certainly a good start, but it only goes out through 10 sec/move. Also, if the hardware is 7x better for AZ than SF, then should we elongate the curve for AZ by 7x? Or compress the curve for SF by 7x? Or some combination? Or take the data at f

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
@computer-go.org] On Behalf Of Gian-Carlo Pascutto Sent: Thursday, December 7, 2017 8:17 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm On 7/12/2017 13:20, Brian Sheppard via Computer-go wro

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-07 Thread Brian Sheppard via Computer-go
o Pascutto Sent: Thursday, December 7, 2017 4:13 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm On 06-12-17 22:29, Brian Sheppard via Computer-go wrote: > The chess result is 64-36: a 100 ra

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-06 Thread Brian Sheppard via Computer-go
he result > was statistically significant > 0. (+35 Elo over 400 games... Brian Sheppard also: > Requiring a margin > 55% is a defense against a random result. A 55% > score in a 400-game match is 2 sigma. Good point. That makes sense. But (where A is best so far, and B is the newer netw

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-06 Thread Brian Sheppard via Computer-go
Requiring a margin > 55% is a defense against a random result. A 55% score in a 400-game match is 2 sigma. But I like the AZ policy better, because it does not require arbitrary parameters. It also improves more fluidly by always drawing training examples from the current probability distributi
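
The two-sigma claim can be checked with a quick back-of-the-envelope sketch (my own arithmetic, not part of the original thread):

```python
import math

def score_sigma(games: int, p: float = 0.5) -> float:
    """Standard deviation of the score fraction over `games` independent
    games, under a null hypothesis of win probability `p`."""
    return math.sqrt(p * (1.0 - p) / games)

sigma = score_sigma(400)    # 0.025 per-game score fraction
z = (0.55 - 0.50) / sigma   # a 55% score is exactly 2 sigma above even
```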

Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

2017-12-06 Thread Brian Sheppard via Computer-go
The chess result is 64-36: a 100 rating point edge! I think the Stockfish open source project improved Stockfish by ~20 rating points in the last year. Given the number of people/computers involved, Stockfish’s annual effort level seems comparable to the AZ effort. Stockfish is really, reall
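
The 100-point figure follows from the logistic Elo model; a quick check of the arithmetic (mine, not from the thread):

```python
import math

def elo_edge(score: float) -> float:
    """Elo difference implied by an expected score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

edge = elo_edge(0.64)  # a 64% score works out to roughly a 100-point edge
```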

Re: [Computer-go] Significance of resignation in AGZ

2017-12-03 Thread Brian Sheppard via Computer-go
, or is it imitating the best known algorithm inconvenient for your purposes? Best, -Chaz On Sat, Dec 2, 2017 at 7:31 PM, Brian Sheppard via Computer-go wrote: I implemented the ad hoc rule of not training on positions after the first pass, and my prog

Re: [Computer-go] Significance of resignation in AGZ

2017-12-02 Thread Brian Sheppard via Computer-go
not to resign too early (even before not passing) On 02/12/2017 at 18:17, Brian Sheppard via Computer-go wrote: I have some hard data now. My network’s initial training reached the same performance in half the iterations. That is, the steepness of skill gain in the first day of training was

Re: [Computer-go] Significance of resignation in AGZ

2017-12-02 Thread Brian Sheppard via Computer-go
, YMMV, etc. From: Brian Sheppard [mailto:sheppar...@aol.com] Sent: Friday, December 1, 2017 5:39 PM To: 'computer-go' Subject: RE: [Computer-go] Significance of resignation in AGZ I didn’t measure precisely because as soon as I saw the training artifacts I changed the code. An

Re: [Computer-go] Significance of resignation in AGZ

2017-12-01 Thread Brian Sheppard via Computer-go
of its capacity to improve accuracy on positions that will not contribute to its ultimate strength. This applies to both ordering and evaluation aspects. From: Andy [mailto:andy.olsen...@gmail.com] Sent: Friday, December 1, 2017 4:55 PM To: Brian Sheppard ; computer-go Subject:

[Computer-go] Significance of resignation in AGZ

2017-12-01 Thread Brian Sheppard via Computer-go
I have concluded that AGZ's policy of resigning "lost" games early is somewhat significant. Not as significant as using residual networks, for sure, but you wouldn't want to go without these advantages. The benefit cited in the paper is speed. Certainly a factor. I see two other advantages. Fi

Re: [Computer-go] Is MCTS needed?

2017-11-16 Thread Brian Sheppard via Computer-go
State of the art in computer chess is alpha-beta search, but note that the search is very selective because of "late move reductions." A late move reduction is to reduce depth for moves after the first move generated in a node. For example, a simple implementation would be "search the first mov
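
A minimal sketch of the late-move-reduction idea described above, over a toy game tree (nested lists with integer leaf evaluations; the depth threshold and re-search policy are my own illustrative choices, not any particular engine's):

```python
def evaluate(node):
    """Crude static evaluation for the sketch: descend to the first leaf."""
    while isinstance(node, list):
        node = node[0]
    return node

def search(node, depth, alpha, beta):
    """Negamax alpha-beta with a one-ply reduction for late moves."""
    if depth == 0 or isinstance(node, int):
        return evaluate(node)
    best = float("-inf")
    for i, child in enumerate(node):
        r = 1 if (i >= 1 and depth >= 3) else 0           # reduce moves after the first
        score = -search(child, depth - 1 - r, -beta, -alpha)
        if r and score > alpha:                           # promising: re-search at full depth
            score = -search(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:                                 # beta cutoff
            break
    return best
```

The gamble is exactly the one described in the post: a reduced search that fails low is trusted, so a deep tactic behind a late move can be missed until the depth grows.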

Re: [Computer-go] Zero is weaker than Master!?

2017-10-26 Thread Brian Sheppard via Computer-go
I would add that "wild guesses based on not enough info" is an indispensable skill. -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Hideki Kato Sent: Thursday, October 26, 2017 10:17 AM To: computer-go@computer-go.org Subject: Re: [Computer-

Re: [Computer-go] AlphaGo Zero SGF - Free Use or Copyright?

2017-10-26 Thread Brian Sheppard via Computer-go
g] On Behalf Of Robert Jasiek Sent: Thursday, October 26, 2017 10:17 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] AlphaGo Zero SGF - Free Use or Copyright? On 26.10.2017 13:52, Brian Sheppard via Computer-go wrote: > MCTS is the glue that binds incompatible rules. This is, h

Re: [Computer-go] AlphaGo Zero SGF - Free Use or Copyright?

2017-10-26 Thread Brian Sheppard via Computer-go
Robert is right, but Robert seems to think this hasn't been done. Actually every prominent non-neural MCTS program since Mogo has been based on the exact design that Robert describes. The best of them achieve somewhat greater strength than Robert expects. MCTS is the glue that binds incompatibl

Re: [Computer-go] Source code (Was: Reducing network size? (Was: AlphaGo Zero))

2017-10-25 Thread Brian Sheppard via Computer-go
I think it uses the champion network. That is, the training periodically generates a candidate, and there is a playoff against the current champion. If the candidate wins by more than 55% then a new champion is declared. Keeping a champion is an important mechanism, I believe. That creates th
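
The gating rule described above reduces to a one-line promotion test (the 55% threshold comes from the post; the framing is my own sketch):

```python
def promote(candidate_score: float, threshold: float = 0.55) -> bool:
    """A candidate network replaces the champion only if its score in the
    playoff match strictly exceeds the gating threshold."""
    return candidate_score > threshold
```

Keeping a single gated champion makes progress monotone in match results, at the cost of occasionally rejecting a genuinely stronger candidate that has an unlucky match.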

Re: [Computer-go] AlphaGo Zero

2017-10-19 Thread Brian Sheppard via Computer-go
So I am reading that residual networks are simply better than normal convolutional networks. There is a detailed write-up here: https://blog.waya.ai/deep-residual-learning-9610bb62c355 Summary: the residual network has a fixed connection that adds (with no scaling) the output of the previous le
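
A minimal numpy sketch of that fixed, unscaled skip connection (dense layers stand in for convolutions; this is illustrative, not the actual AGZ block):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """y = relu(x + F(x)): the input is added, with no scaling, to the output
    of the two-layer transform F, so the block only has to learn a residual."""
    fx = relu(x @ w1) @ w2
    return relu(x + fx)

# With F == 0 the block is exactly the identity on non-negative inputs,
# which is why very deep stacks of these blocks remain trainable.
x = np.ones((1, 4))
y = residual_block(x, np.zeros((4, 4)), np.zeros((4, 4)))
```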

Re: [Computer-go] AlphaGo Zero

2017-10-18 Thread Brian Sheppard via Computer-go
o.org] On Behalf Of Gian-Carlo Pascutto Sent: Wednesday, October 18, 2017 5:40 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] AlphaGo Zero On 18/10/2017 22:00, Brian Sheppard via Computer-go wrote: > This paper is required reading. When I read this team’s papers, I > think to my

Re: [Computer-go] AlphaGo Zero

2017-10-18 Thread Brian Sheppard via Computer-go
Some thoughts toward the idea of general game-playing... One aspect of Go is ideally suited for visual NN: strong locality of reference. That is, stones affect stones that are nearby. I wonder whether the late emergence of ladder understanding within AlphaGo Zero is an artifact of the board re

Re: [Computer-go] AlphaGo Zero

2017-10-18 Thread Brian Sheppard via Computer-go
ednesday, October 18, 2017 4:38 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] AlphaGo Zero On 18/10/2017 22:00, Brian Sheppard via Computer-go wrote: > A stunning result. The NN uses a standard vision architecture (no Go > adaptation beyond what is necessary to represent the ga

Re: [Computer-go] AlphaGo Zero

2017-10-18 Thread Brian Sheppard via Computer-go
This paper is required reading. When I read this team’s papers, I think to myself “Wow, this is brilliant! And I think I see the next step.” When I read their next paper, they show me the next *three* steps. I can’t say enough good things about the quality of the work. A stunning result. The

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
PM To: Brian Sheppard Cc: computer-go Subject: Re: [Computer-go] Alphago and solving Go This is semantics. Yes, in the limit of infinite time, it is brute-force. Meanwhile, in the real world, AlphaGo chooses to balance its finite time budget between depth & width. The mere fact that the

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
[mailto:lightvec...@gmail.com] Sent: Sunday, August 6, 2017 2:54 PM To: Brian Sheppard ; computer-go@computer-go.org Cc: Steven Clark Subject: Re: [Computer-go] Alphago and solving Go Saying in an unqualified way that AlphaGo is brute force is wrong in the spirit of the question. Assuming AlphaGo uses a

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
There is a definition of “brute force” on Wikipedia. Seems to me that MCTS fits. Deep Blue fits. From: Álvaro Begué [mailto:alvaro.be...@gmail.com] Sent: Sunday, August 6, 2017 2:56 PM To: Brian Sheppard ; computer-go Subject: Re: [Computer-go] Alphago and solving Go It was already a

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
possible in each node. It is in this respect that AlphaGo truly excels. (AlphaGo also excels at whole board evaluation, but that is a separate topic.) From: Steven Clark [mailto:steven.p.cl...@gmail.com] Sent: Sunday, August 6, 2017 1:14 PM To: Brian Sheppard ; computer-go Subject: Re

Re: [Computer-go] Alphago and solving Go

2017-08-06 Thread Brian Sheppard via Computer-go
Yes, AlphaGo is brute force. No, it is impossible to solve Go. Perfect play looks a lot like AlphaGo in that you would not be able to tell the difference. But I think that AlphaGo still has a 0% win rate against perfect play. My own best guess is that top humans make about 12 errors per game. T

Re: [Computer-go] Possible idea - decay old simulations?

2017-07-24 Thread Brian Sheppard via Computer-go
>I haven't tried it, but (with the computer chess hat on) these kind of >proposals behave pretty badly when you get into situations where your >evaluation is off and there are horizon effects. In computer Go, this issue focuses on cases where the initial move ordering is bad. It isn't so much e

Re: [Computer-go] Possible idea - decay old simulations?

2017-07-23 Thread Brian Sheppard via Computer-go
Yes. This is a long-known phenomenon. I was able to get improvements in Pebbles based on the idea of forgetting unsuccessful results. It has to be done somewhat carefully, because results propagate up the tree. But you can definitely make it work. I recall a paper published on this basis.
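
One way to implement that kind of forgetting is exponentially decayed node statistics (a sketch under my own assumptions; not the scheme Pebbles actually used):

```python
class DecayedStats:
    """Win/visit statistics where each new result downweights everything
    recorded before it by a factor `gamma` (gamma = 1 is plain averaging)."""
    def __init__(self, gamma: float = 0.99):
        self.gamma = gamma
        self.weight = 0.0
        self.wins = 0.0

    def add(self, result: float) -> None:
        self.weight = self.gamma * self.weight + 1.0
        self.wins = self.gamma * self.wins + result

    def mean(self) -> float:
        return self.wins / self.weight

# A recent win outweighs an equally weighted older loss: with gamma = 0.5,
# a loss followed by a win averages above 50%.
s = DecayedStats(gamma=0.5)
s.add(0.0)
s.add(1.0)
```

As the post notes, care is needed because these statistics propagate up the tree; decaying a child's mean changes every ancestor's backed-up value.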

Re: [Computer-go] Value network that doesn't want to learn.

2017-06-23 Thread Brian Sheppard via Computer-go
>... my value network was trained to tell me the game is balanced at the >beginning... :-) The best training policy is to select positions that correct errors. I used the policies below to train a backgammon NN. Together, they reduced the expected loss of the network by 50% (cut the error rate

Re: [Computer-go] mini-max with Policy and Value network

2017-05-22 Thread Brian Sheppard via Computer-go
uter-go.org] On Behalf Of Gian-Carlo Pascutto Sent: Monday, May 22, 2017 4:08 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] mini-max with Policy and Value network On 20/05/2017 22:26, Brian Sheppard via Computer-go wrote: > Could use late-move reductions to eliminate the hard pruning

Re: [Computer-go] mini-max with Policy and Value network

2017-05-20 Thread Brian Sheppard via Computer-go
Could use late-move reductions to eliminate the hard pruning. Given the accuracy rate of the policy network, I would guess that even move 2 should be reduced. -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Hiroshi Yamashita Sent: Saturday,

Re: [Computer-go] Patterns and bad shape

2017-04-18 Thread Brian Sheppard via Computer-go
herty...@gmail.com] Sent: Tuesday, April 18, 2017 9:06 AM To: computer-go@computer-go.org; Brian Sheppard Subject: Re: [Computer-go] Patterns and bad shape Now, I love this idea. A super fast cheap pattern matcher can act as input into a neural network input layer in sort of a "pay additional

Re: [Computer-go] Patterns and bad shape

2017-04-18 Thread Brian Sheppard via Computer-go
Adding patterns is very cheap: encode the patterns as an if/else tree, and it is O(log n) to match. Pattern matching as such did not show up as a significant component of Pebbles. But that is mostly because all of the machinery that makes pattern-matching cheap (incremental updating of 3x3 n
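
A toy version of the if/else-tree idea: split the pattern set on one point per level, so a lookup costs a number of comparisons roughly logarithmic in the number of patterns (the string encoding and alphabet here are my own, purely illustrative):

```python
def build_tree(patterns, pos=0):
    """Build a decision tree over equal-length pattern strings ('B'/'W'/'.')
    by branching on one board point per level."""
    if len(patterns) <= 1 or pos == len(patterns[0]):
        return patterns
    branches = {}
    for p in patterns:
        branches.setdefault(p[pos], []).append(p)
    return (pos, {c: build_tree(ps, pos + 1) for c, ps in branches.items()})

def match(tree, nbhd):
    """Follow the tree one point at a time; verify the surviving pattern."""
    while isinstance(tree, tuple):
        pos, branches = tree
        tree = branches.get(nbhd[pos])
        if tree is None:
            return None
    return tree[0] if tree and tree[0] == nbhd else None

tree = build_tree(["B..", "BW.", ".W."])
```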

Re: [Computer-go] dealing with multiple local optima

2017-02-24 Thread Brian Sheppard via Computer-go
networks have always been surprisingly successful at Go. Like Brugmann’s paper about Monte Carlo, which was underestimated for a long time. Sigh. Best, Brian From: Minjae Kim [mailto:xive...@gmail.com] Sent: Friday, February 24, 2017 11:56 AM To: Brian Sheppard ; computer-go@computer

Re: [Computer-go] dealing with multiple local optima

2017-02-24 Thread Brian Sheppard via Computer-go
Neural networks always have a lot of local optima. Simply because they have a high degree of internal symmetry. That is, you can “permute” sets of coefficients and get the same function. Don’t think of starting with expert training as a way to avoid local optima. It is a way to start trainin

Re: [Computer-go] Playout policy optimization

2017-02-12 Thread Brian Sheppard via Computer-go
If your database is composed of self-play games, then the likelihood maximization policy should gain strength rapidly, and there should be a way to have asymptotic optimality. (That is, the patterns alone will play a perfect game in the limit.) Specifically: play self-play games using an asy

Re: [Computer-go] AlphaGo rollout nakade patterns?

2017-01-31 Thread Brian Sheppard via Computer-go
be able to improve the rollouts at some point. Roel On 31 January 2017 at 17:21, Brian Sheppard via Computer-go wrote: If a "diamond" pattern is centered on a 5x5 square, then you have 13 points. The diagram below will give the idea.

__+__
_+++_
+++++
_+++_
__+__

At o

Re: [Computer-go] AlphaGo rollout nakade patterns?

2017-01-31 Thread Brian Sheppard via Computer-go
identify all potential nakade." seems to indicate this is a known pattern set? could i find some more information on it somewhere? also i was unable to find Pebbles, is it open source? @Robert Jasiek what definitions/papers/publications are you referring to? m.v.g. Roel On 24 Januar

Re: [Computer-go] AlphaGo rollout nakade patterns?

2017-01-24 Thread Brian Sheppard via Computer-go
r-go@computer-go.org Subject: Re: [Computer-go] AlphaGo rollout nakade patterns? On 23-01-17 20:10, Brian Sheppard via Computer-go wrote: > only captures of up to 9 stones can be nakade. I don't really understand this. http://senseis.xmp.net/?StraightThree Both constructing this shape and pl

Re: [Computer-go] AlphaGo rollout nakade patterns?

2017-01-23 Thread Brian Sheppard via Computer-go
A capturing move has a potential nakade if the string that was removed is among a limited set of possibilities. Probably AlphaGo has a 13-point bounding region (e.g., the 13-point star) that it uses as a positional index, and therefore an 8192-sized pattern set will identify all potential nakad
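
A sketch of how such an index might work (the 13-point diamond region comes from the post; the bit layout and coordinate scheme are my own, hypothetical):

```python
# Offsets of the 13-point diamond: Manhattan distance <= 2 from the center.
DIAMOND = [(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)
           if abs(dx) + abs(dy) <= 2]

def shape_index(stones, center):
    """13-bit occupancy index of a captured string's shape relative to the
    diamond's center, giving 2**13 = 8192 possible patterns."""
    cx, cy = center
    idx = 0
    for bit, (dx, dy) in enumerate(DIAMOND):
        if (cx + dx, cy + dy) in stones:
            idx |= 1 << bit
    return idx
```

The index can then be a direct lookup into an 8192-entry table of which shapes are nakade.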

Re: [Computer-go] Training the value network (a possibly more efficient approach)

2017-01-10 Thread Brian Sheppard
I was writing code along those lines when AlphaGo debuted. When it became clear that AlphaGo had succeeded, then I ceased work. So I don’t know whether this strategy will succeed, but the theoretical merits were good enough to encourage me. Best of luck, Brian From: Computer-go [mail

Re: [Computer-go] Some experiences with CNN trained on moves by thewinning player

2016-12-20 Thread Brian Sheppard
The downside to training the moves of the winner is that you "throw away" half of the data. But that isn't so bad, because you can always get as much data as you need. The upside is that losing positions do not necessarily have a training signal. That is: there is no good move in a losing posit

Re: [Computer-go] Some experiences with CNN trained on moves by the winning player

2016-12-11 Thread Brian Sheppard
IMO, training using only the moves of winners is obviously the practical choice. Worst case: you "waste" half of your data. But that is actually not a downside provided that you have lots of data, and as your program strengthens you will avoid potential data-quality problems. Asymptotically, yo

Re: [Computer-go] Converging to 57%

2016-08-23 Thread Brian Sheppard
The learning rate seems much too high. My experience (which is from backgammon rather than Go, among other caveats) is that you need tiny learning rates. Tiny, as in 1/TrainingSetSize. Neural networks are dark magic. Be prepared to spend many weeks just trying to figure things out. You can b

Re: [Computer-go] Crazystone on Steam

2016-05-27 Thread Brian Sheppard
Are you the Johannes Laire who works at Cimpress in WTR? -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Johannes Laire Sent: Friday, May 27, 2016 4:45 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] Crazystone on Steam The 201

Re: [Computer-go] new challenge for Go programmers

2016-03-31 Thread Brian Sheppard
Ingo Althöfer" Sent: Thursday, March 31, 2016 4:31 AM To: computer-go@computer-go.org Subject: Re: [Computer-go] new challenge for Go programmers Hello all, "Brian Sheppard" wrote: > ... This is out of line, IMO. Djhbrown asked a sensible question that > has valuab

Re: [Computer-go] new challenge for Go programmers

2016-03-30 Thread Brian Sheppard
This is out of line, IMO. Djhbrown asked a sensible question that has valuable intentions. I would like to see responsible, thoughtful, and constructive replies. From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of uurtamo . Sent: Wednesday, March 30, 2016 7:43 PM To:

Re: [Computer-go] new challenge for Go programmers

2016-03-30 Thread Brian Sheppard
Trouble is that it is very difficult to put certain concepts into mathematics. For instance: “well, I tried to find parameters that did a better job of minimizing that error function, but eventually I lost patience.” :-) Neural network parameters are not directly humanly understandable. They

Re: [Computer-go] Nice graph

2016-03-25 Thread Brian Sheppard
Hmm, seems to imply a 1000-Elo edge over human 9p. But such a player would literally never lose a game to a human. I take this as an example of the difficulty of extrapolating based on games against computers. (and the slide seems to have a disclaimer to this effect, if I am reading the text on
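
For reference, the logistic Elo model gives a 1000-point edge an expected score very close to certainty (my own arithmetic, not from the slide):

```python
def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

p = expected_score(1000.0)  # about 0.997: roughly one loss per ~300 games
```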

Re: [Computer-go] AlphaGo & DCNN: Handling long-range dependency

2016-03-15 Thread Brian Sheppard
ches to contend for championship caliber. -Original Message----- From: Brian Sheppard [mailto:sheppar...@aol.com] Sent: Tuesday, March 15, 2016 6:20 PM To: 'computer-go@computer-go.org' Subject: RE: [Computer-go] AlphaGo & DCNN: Handling long-range dependency >So a small err

Re: [Computer-go] AlphaGo & DCNN: Handling long-range dependency

2016-03-15 Thread Brian Sheppard
>So a small error in the opening or middle game can literally be worth anything >by the time the game ends. These are my estimates: human pros >= 24 points lost, and >= 5 game-losing errors against other human pros. I relayed my experience of a comparable experiment with chess, and how those e

Re: [Computer-go] AlphaGo & DCNN: Handling long-range dependency

2016-03-14 Thread Brian Sheppard
Maybe 10 years ago I extrapolated from the results of title matches to arrive at an estimate of perfect play. My memory is hazy about the conclusions (sorry), so I would love if someone is curious enough to build models and draw conclusions and share them. The basis of the estimate is that Blac

Re: [Computer-go] Game 4: a rare insight

2016-03-13 Thread Brian Sheppard
I have the impression that the value network is used to initialize the score of a node to, say, 70% out of N trials. Then the MCTS is trial N+1, N+2, etc. Still asymptotically optimal, but if the value network is accurate then you have a big acceleration in accuracy because the scores start from
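
One common way to realize that initialization is to seed the node with virtual visits (a sketch; the 100-visit prior weight is purely hypothetical):

```python
class Node:
    """MCTS node whose running average starts from a value-network prior,
    treated as if `prior_visits` playouts had already returned that value."""
    def __init__(self, value_prior: float, prior_visits: int = 100):
        self.visits = prior_visits
        self.wins = value_prior * prior_visits

    def update(self, result: float) -> None:
        self.visits += 1
        self.wins += result

    def mean(self) -> float:
        return self.wins / self.visits

n = Node(value_prior=0.70)  # "70% out of N trials"
n.update(1.0)               # the MCTS result of trial N+1
```

The scheme stays asymptotically optimal because the fixed prior weight is eventually swamped by real playouts.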

Re: [Computer-go] Game 4: a rare insight

2016-03-13 Thread Brian Sheppard
I would not place too much confidence in the observers. Even though they are pro players, they don't have the same degree of concentration as the game's participants, and they have an obligation to speak about the game on a regular basis which further deteriorates analytic skills. Figuring out w

Re: [Computer-go] Congratulations to AlphaGo

2016-03-12 Thread Brian Sheppard
Play on KGS. Pros can be anonymous, and test themselves and AlphaGo at the same time. :-) From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Jim O'Flaherty Sent: Saturday, March 12, 2016 4:56 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] Congratulation

Re: [Computer-go] Congratulations to AlphaGo

2016-03-12 Thread Brian Sheppard
Playing on KGS will give a good sense. Players have sustained ranks between 9 and 10 dan there, but never over 10 dan. If AlphaGo can sustain over 10 dan, then there is clear separation. -Original Message- From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Thoma

Re: [Computer-go] AlphaGo & DCNN: Handling long-range dependency

2016-03-11 Thread Brian Sheppard
Actually chess software is much, much better. I recall that today's software running on 1998 hardware beats 1998 software running on today's hardware. It was very soon after 1998 that ordinary PCs could play on a par with world champions. -Original Message- From: Computer-go [mailto:com

Re: [Computer-go] Finding Alphago's Weaknesses

2016-03-10 Thread Brian Sheppard
Amen to Don Dailey. He would be so proud. From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Jim O'Flaherty Sent: Thursday, March 10, 2016 6:49 PM To: computer-go@computer-go.org Subject: Re: [Computer-go] Finding Alphago's Weaknesses I think we are going to see a

Re: [Computer-go] Neural Net move prediction

2016-02-04 Thread Brian Sheppard
The method is not likely to work, since the goal of NN training is to reduce the residual error to a random function of the NN inputs. If the NN succeeds at this, then there will be no signal to train against. If the NN fails, then it could be because the NN is not large enough, or because there

Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search

2016-02-01 Thread Brian Sheppard
You play until neither player wishes to make a move. The players are willing to move on any point that is not self-atari, and they are willing to make self-atari plays if capture would result in a Nakade (http://senseis.xmp.net/?Nakade) This correctly plays seki. From: Computer-go [mail

Re: [Computer-go] A proposition to improve neural network based on min max

2016-01-30 Thread Brian Sheppard
Seems related to “temporal difference learning.” Each position’s max value should equal the next position’s min value in the asymptote, so you train the system accordingly. From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Xavier Combelle Sent: Saturday, January 30,
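
A minimal TD(0) sketch of that idea: each position's value is nudged toward the value of its successor, so consecutive values agree in the limit (my own illustration, not a full min/max formulation):

```python
def td_update(values, path, alpha=0.1):
    """TD(0): move each position's value a step of size `alpha` toward its
    successor's value along one observed game path."""
    for s, s_next in zip(path, path[1:]):
        values[s] += alpha * (values[s_next] - values[s])
    return values

values = td_update({"a": 0.0, "b": 1.0}, ["a", "b"])
```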

Re: [Computer-go] Game Over

2016-01-28 Thread Brian Sheppard
seem to work quite well. There is a paper by him on this topic in the proceedings of the conference Advances in Computer Games 2015 (in Leiden , NL), published by Springer Lecture Notes in Computer Science (LNCS). A very early application of truncated rollouts was applied by Brian Sheppard i

Re: [Computer-go] Report KGS Slow Tournament

2015-09-25 Thread Brian Sheppard
https://chessprogramming.wikispaces.com/Tobias+Graf there are also good research reports on his web site. From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Aja Huang Sent: Friday, September 25, 2015 1:42 PM To: computer-go@computer-go.org Subject: Re: [Computer-go

Re: [Computer-go] alternative for cgos

2015-01-13 Thread Brian Sheppard
Alas, the limitations of a hosted environment would make participating in Baduk.io a non-starter for me. It is much better to build a selection of standard bots and run tests on my own hardware. Which is what I have done in the absence of CGOS. IMO, running using one’s own hardware/OS/langua

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-23 Thread Brian Sheppard
A 3-layer network (input, hidden, output) is sufficient to be a universal function approximator, so from a theoretical perspective only 3 layers are necessary. But the gap between theoretical and practical is quite large. The CNN architecture builds in translation invariance and sensitivity t

Re: [Computer-go] Move Evaluation in Go Using Deep Convolutional Neural Networks

2014-12-22 Thread Brian Sheppard
I wondered that too, because the search tree frequently reaches positions by transpositions. Only testing would say for sure. And even then, YMMV. From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Stefan Kaitschick Sent: Monday, December 22, 2014 9:46 AM To: comp

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-19 Thread Brian Sheppard
>This is Aya's move predictor(W) vs GNU Go(B). >http://eidogo.com/#3BNw8ez0R >I think previous move effect is too strong. This is a good example of why a good playout engine will not necessarily play well. The purpose of the playout policy is to *balance* errors. Following your opponent's last

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-16 Thread Brian Sheppard
e paper and I hope you will enjoy reading it. Regards, Aja On Tue, Dec 16, 2014 at 2:03 PM, Brian Sheppard wrote: My impression is that each feature gets a single weight in Crazy-stone. The team-of-features aspect arises because a single point can match several patterns, so you need

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-16 Thread Brian Sheppard
is directly to play? If the former, it had better be reasonably fast. The latter approach can be far slower, but requires the predictions to be of much higher quality. And the biggest question, how can you make these two approaches interact efficiently? René On Mon, Dec 15, 2014 at 8:0

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-15 Thread Brian Sheppard
>Is it really such a burden? Well, I have to place my bets on some things and not on others. It seems to me that the costs of a NN must be higher than a system based on decision trees. The convolution NN has a very large parameter space if my reading of the paper is correct. Specifically,

Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play Go

2014-12-15 Thread Brian Sheppard
So I read this kind of study with some skepticism. My guess is that the "large-scale pattern" systems in use by leading programs are already pretty good for their purpose (i.e., progressive bias). Rampant personal and unverified speculation follows...

[computer-go] Paper about Mogo's Opening Strategy

2010-01-15 Thread Brian Sheppard
I recommend the paper http://hal.inria.fr/docs/00/36/97/83/PDF/ouvertures9x9.pdf by the Mogo team, which describes how to use a grid to compute Mogo's opening book using coevolution. Brian ___ computer-go mailing list computer-go@computer-go.org http://

[computer-go] Optimizing combinations of flags

2009-11-24 Thread Brian Sheppard
>What do you do when you add a new parameter? Do you retain your existing >'history', considering each game to have been played with the value of >the new parameter set to zero? Yes, exactly. >If you have 50 parameters already, doesn't adding a new parameter create >a rather large number of new p

[computer-go] Optimizing combinations of flags

2009-11-23 Thread Brian Sheppard
>Your system seems very interesting but it seems to me that you assume >that each parameter is independent. >What happens if, for example, two parameters work well when only one of >them is active and badly if the two are active at the same time? I think that I am assuming only that the objecti

[computer-go] Optimizing combinations of flags

2009-11-23 Thread Brian Sheppard
>From what I understand, for each parameter you take with some high >probability the best so far, and with some lower probability the least >tried one. This requires (manually) enumerating all parameters on some >integer scale, if I got it correctly. Yes, for each parameter you make a range of val
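A hypothetical sketch of the per-parameter scheme described above (the class name and details are my own, not Pebbles' actual code): each flag independently takes its best value so far with high probability, otherwise its least-tried value, and the game result then updates the (wins, games) record of every value that was used.

```python
import random

# Hypothetical sketch of per-parameter flag tuning as described in the
# post: exploit the best value so far with high probability, otherwise
# explore the least-tried value; one game result updates every flag.

class FlagTuner:
    def __init__(self, values_per_flag, explore=0.2, rng=None):
        self.rng = rng or random.Random()
        self.explore = explore
        # stats[flag][value] = [wins, games]
        self.stats = [{v: [0, 0] for v in vals} for vals in values_per_flag]

    def choose(self):
        setting = []
        for stat in self.stats:
            if self.rng.random() < self.explore:
                v = min(stat, key=lambda x: stat[x][1])   # least tried
            else:
                # best empirical win rate, lightly smoothed for 0-game values
                v = max(stat, key=lambda x: (stat[x][0] + 1) / (stat[x][1] + 2))
            setting.append(v)
        return setting

    def update(self, setting, won):
        for stat, v in zip(self.stats, setting):
            stat[v][0] += int(won)
            stat[v][1] += 1
```

With 17 binary flags this samples one combination per game instead of enumerating all 2^17 of them, which matches the "assume the objective is roughly separable per parameter" reading of the thread.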

[computer-go] Optimizing combinations of flags

2009-11-20 Thread Brian Sheppard
>On that topic, I have around 17 flags that enable or disable features in my >pure playout bots, and I want to search for the best combination of them. >I know this is almost a dream, but does anyone know the best way to >approximate this? Pebbles randomly chooses (using a zero asymptotic regret strategy)

[computer-go] A Tenure Track Position in Japan

2009-11-05 Thread Brian Sheppard
Man, I just don't live in the right location. First Paris, and now Japan. Can't a position open in, say, Boston? :-(

[computer-go] MPI vs Thread-safe

2009-11-01 Thread Brian Sheppard
>The PRNG (pseudo random number generator) can cause the super-linear >speed-up, for example. Really? The serial simulation argument seems to apply.

[computer-go] MPI vs Thread-safe

2009-10-30 Thread Brian Sheppard
Back of envelope calculation: MFG processes 5K nodes/sec/core * 4 cores per process = 20K nodes/sec/process. Four processes makes 80K nodes/sec. If you think for 30 seconds (pondering + move time) then you are at 2.4 million nodes. Figure about 25,000 nodes having 100 visits or more. UCT data is ro
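The back-of-envelope arithmetic above checks out; here it is as a few lines of code (an illustrative sketch — the 25,000-node figure for nodes with 100+ visits is the post's own estimate, not derived here).

```python
# Reproducing the post's back-of-envelope throughput numbers.
nodes_per_sec_per_core = 5_000
cores_per_process = 4
processes = 4
think_seconds = 30            # pondering + move time

per_process = nodes_per_sec_per_core * cores_per_process  # nodes/sec/process
total_rate = per_process * processes                      # nodes/sec overall
nodes_per_move = total_rate * think_seconds               # nodes per decision

assert per_process == 20_000
assert total_rate == 80_000
assert nodes_per_move == 2_400_000
```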

[computer-go] MPI vs Thread-safe

2009-10-30 Thread Brian Sheppard
>> Parallelization *cannot* provide super-linear speed-up. > >I don't see that at all. This is standard computer science stuff, true of all parallel programs and not just Go players. No parallel program can be better than N times a serial version. The result follows from a simulation argument. Su

[computer-go] Reservations about Parallelization Strategy

2009-10-30 Thread Brian Sheppard
While re-reading the parallelization papers, I tried to formulate why I thought that they couldn't be right. The issue with Mango reporting a super-linear speed-up was an obvious red flag, but that doesn't mean that their conclusions were wrong. It just means that Mango's exploration policy needs t

[computer-go] MPI vs Thread-safe

2009-10-30 Thread Brian Sheppard
>I only share UCT wins and visits, and the MPI version only >scales well to 4 nodes. The scalability limit seems very low. Just curious: what is the policy for deciding what to synchronize? I recall that MoGo shared only the root node.

[computer-go] MPI vs Thread-safe

2009-10-30 Thread Brian Sheppard
>I personally just use root parallelization in Pachi I think this answers my question; each core in Pachi independently explores a tree, and the master thread merges the data. This is even though you have shared memory on your machine. >Have you read the Parallel Monte-Carlo Tree Search paper?
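A minimal sketch of root parallelization as described (illustrative only, not Pachi's actual implementation): each worker searches its own independent tree, and the master sums the per-move (wins, visits) statistics from every root before picking the most-visited move.

```python
# Illustrative sketch of root parallelization: merge independent root
# statistics from each worker, then pick the most-visited move.

def merge_roots(roots):
    """roots: list of {move: (wins, visits)} dicts, one per worker."""
    merged = {}
    for root in roots:
        for move, (wins, visits) in root.items():
            w, v = merged.get(move, (0, 0))
            merged[move] = (w + wins, v + visits)
    return merged

def pick_move(merged):
    # Most-visited move: the usual robust choice at the root.
    return max(merged, key=lambda m: merged[m][1])

workers = [
    {"D4": (60, 100), "Q16": (40, 90)},
    {"D4": (55, 110), "C3": (10, 30)},
]
merged = merge_roots(workers)
assert merged["D4"] == (115, 210)
assert pick_move(merged) == "D4"
```

Because only one (wins, visits) pair per root move crosses process boundaries, this needs far less bandwidth than sharing interior tree nodes, which fits MoGo's reported choice of sharing only the root.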

[computer-go] MPI vs Thread-safe

2009-10-29 Thread Brian Sheppard
I have a question for those who have parallelized programs. It seems like MPI is the obvious architecture when scaling a program to multiple machines. Let's assume that we implement a program that has that capability. Now, it is possible to use MPI for scaling *within* a compute node. For example

[computer-go] NVidia Fermi and go?

2009-10-23 Thread Brian Sheppard
>> BTW, it occurs to me that we can approximate the efficiency of >> parallelization by taking execution counts from a profiler and >> post-processing them. I should do that before buying a new GPU. :-) >I wonder what you mean by that. If you run your program on a sequential machine and count sta

[computer-go] NVidia Fermi and go?

2009-10-23 Thread Brian Sheppard
>In my own gpu experiment (light playouts), registers/memory were the >bounding factors on simulation speed. I respect your experimental finding, but I note that you have carefully specified "light playouts," probably because you suspect that there may be a significant difference if playouts are h
