I read through that paper, but I admit that I didn't really get where the extra
power comes from.
-Original Message-
From: valkyria
To: Computer-go
Sent: Mon, Nov 25, 2019 6:41 am
Subject: [Computer-go] MuZero - new paper from DeepMind
Hi,
if anyone still gets email from this list:
D
I remember a scheme (from Dave Dyer, IIRC) that indexed positions based on the
points on which the 20th, 40th, 60th,... moves were made. IIRC it was nearly a
unique key for pro positions.
Best,Brian
-Original Message-
From: Erik van der Werf
To: computer-go
Sent: Tue, Sep 17, 2019 5:5
>> contrary to intuition built up from earlier-generation MCTS programs in Go,
>> putting significant weight on score maximization rather than only
>> win/loss seems to help.
This narrative glosses over important nuances.
Collectively we are trying to find the golden mean of cost efficiency...
Thanks for the explanation. I agree that there is no actual consistency in
exploration terms across historical papers.
I confirmed that the PUCT formulas across the AG, AGZ, and AZ papers are all
consistent. That is unlikely to be an error. So now I am wondering whether the
faster decay is usef
Subject: Re: [Computer-go] PUCT formula
On 08-03-18 18:47, Brian Sheppard via Computer-go wrote:
> I recall that someone investigated this question, but I don’t recall
> the result. What is the formula that AGZ actually uses?
The one mentioned in their paper, I assume.
I investigated both
In the AGZ paper, there is a formula for what they call “a variant of the PUCT
algorithm”, and they cite a paper from Christopher Rosin:
http://gauss.ececs.uc.edu/Workshops/isaim2010/papers/rosin.pdf
But that paper has a formula that he calls the PUCB formula, which incorporates
the priors i
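For readers without the papers at hand, the selection rule as I read it in the AGZ paper (my transcription, not a quote) is to pick the child maximizing Q(s,a) + U(s,a), with

U(s,a) = c_puct * P(s,a) * sqrt(sum_b N(s,b)) / (1 + N(s,a))

where P(s,a) is the policy network's prior, N(s,a) the visit count, and c_puct the exploration constant.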
The technique originated with backgammon players in the late 1970's, who would
roll out positions manually. Ron Tiekert (Scrabble champion) also applied the
technique to Scrabble, and I took that idea for Maven. It seemed like people
were using the terms interchangeably.
-Original Message--
(as well as many others) with MCTS for
tactically dominated games is bad, and there must be some breakthrough in that
regard in AlphaZero
for it to be able to compete with Stockfish on a tactical level.
I am curious how Rémi's attempt at Shogi using AlphaZero's method will turn out.
On Tue, Mar 6, 2018 at 9:41 AM, Brian Sheppard via Computer-go
<computer-go@computer-go.org> wrote:
Training on Stockfish games is guaranteed to produce a blunder-fest, because
there are no blunders in the training set and therefore the policy network
never learns how to refute blunders.
This is not a flaw in MCTS, but rather in the policy network. MCTS will
eventually search every move in
Seems like extraordinarily fast progress. Great to hear that.
-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
"Ingo Althöfer"
Sent: Friday, December 29, 2017 12:30 PM
To: computer-go@computer-go.org
Subject: [Computer-go] Project Leela Zero
I agree that having special knowledge for "pass" is not a big compromise, but
it would not meet the "zero knowledge" goal, no?
-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
Rémi Coulom
Sent: Friday, December 29, 2017 7:50 AM
To: computer-g
>I wouldn't find it so surprising if eventually the 20 or 40 block networks
>develop a set of convolutional channels that traces possible ladders
>diagonally across the board.
Learning the deep tactics is more-or-less guaranteed because of the interaction
between search and evaluation throug
Agreed.
You can push this farther. If we define an “error” as a move that flips the W/L
state of a Go game, then only the side that is currently winning can make an
error. Let’s suppose that 6.5 komi is winning for Black. Then Black can make an
error, and after he does then White can make an
AZ scalability looks good in that diagram, and it is certainly a good start,
but it only goes out through 10 sec/move. Also, if the hardware is 7x better
for AZ than SF, then should we elongate the curve for AZ by 7x? Or compress the
curve for SF by 7x? Or some combination? Or take the data at f
@computer-go.org] On Behalf Of
Gian-Carlo Pascutto
Sent: Thursday, December 7, 2017 8:17 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a
General Reinforcement Learning Algorithm
On 7/12/2017 13:20, Brian Sheppard via Computer-go wro
o Pascutto
Sent: Thursday, December 7, 2017 4:13 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a
General Reinforcement Learning Algorithm
On 06-12-17 22:29, Brian Sheppard via Computer-go wrote:
> The chess result is 64-36: a 100 ra
he result
> was statistically significant > 0. (+35 Elo over 400 games...
Brian Sheppard also:
> Requiring a margin > 55% is a defense against a random result. A 55%
> score in a 400-game match is 2 sigma.
Good point. That makes sense.
But (where A is best so far, and B is the newer netw
Requiring a margin > 55% is a defense against a random result. A 55% score in a
400-game match is 2 sigma.
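Spelling out the arithmetic (my own check, ignoring draws): with a true win rate of 50%, the standard deviation of the score over 400 games is sqrt(400 * 0.5 * 0.5) = 10 games, i.e. 2.5%. A 55% score is 220 wins, 20 games above the expected 200, which is 2 sigma.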
But I like the AZ policy better, because it does not require arbitrary
parameters. It also improves more fluidly by always drawing training examples
from the current probability distributi
The chess result is 64-36: a 100 rating point edge! I think the Stockfish open
source project improved Stockfish by ~20 rating points in the last year. Given
the number of people/computers involved, Stockfish’s annual effort level seems
comparable to the AZ effort.
Stockfish is really, reall
, or is it imitating the best known
algorithm inconvenient for your purposes?
Best,
-Chaz
On Sat, Dec 2, 2017 at 7:31 PM, Brian Sheppard via Computer-go
<computer-go@computer-go.org> wrote:
I implemented the ad hoc rule of not training on positions after the first
pass, and my prog
not to resign too early
(even before not passing)
On 02/12/2017 at 18:17, Brian Sheppard via Computer-go wrote:
I have some hard data now. My network’s initial training reached the same
performance in half the iterations. That is, the steepness of skill gain in the
first day of training was
, YMMV, etc.
From: Brian Sheppard [mailto:sheppar...@aol.com]
Sent: Friday, December 1, 2017 5:39 PM
To: 'computer-go'
Subject: RE: [Computer-go] Significance of resignation in AGZ
I didn’t measure precisely because as soon as I saw the training artifacts I
changed the code. An
of its capacity to
improve accuracy on positions that will not contribute to its ultimate
strength. This applies to both ordering and evaluation aspects.
From: Andy [mailto:andy.olsen...@gmail.com]
Sent: Friday, December 1, 2017 4:55 PM
To: Brian Sheppard ; computer-go
Subject:
I have concluded that AGZ's policy of resigning "lost" games early is somewhat
significant. Not as significant as using residual networks, for sure, but you
wouldn't want to go without these advantages.
The benefit cited in the paper is speed. Certainly a factor. I see two other
advantages.
Fi
State of the art in computer chess is alpha-beta search, but note that the
search is very selective because of "late move reductions."
A late move reduction is to reduce depth for moves after the first move
generated in a node. For example, a simple implementation would be "search the
first mov
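A minimal sketch of the idea in negamax form (illustrative only; evaluate, ordered_moves, and make_move are placeholders for the host engine, and real engines tune the reduction and re-search rules much more carefully):

INF = float("inf")

def search(pos, depth, alpha, beta):
    # Negamax alpha-beta with a simple late move reduction: moves after the
    # first are searched one ply shallower, and re-searched at full depth if
    # the reduced search beats alpha.
    if depth <= 0:
        return evaluate(pos)                       # placeholder static eval
    best = -INF
    for i, move in enumerate(ordered_moves(pos)):  # placeholder move ordering
        reduction = 1 if (i >= 1 and depth >= 3) else 0
        child = make_move(pos, move)               # placeholder
        score = -search(child, depth - 1 - reduction, -beta, -alpha)
        if reduction and score > alpha:
            score = -search(child, depth - 1, -beta, -alpha)  # verify at full depth
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break                                  # beta cutoff
    return best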
I would add that "wild guesses based on not enough info" is an indispensable
skill.
-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
Hideki Kato
Sent: Thursday, October 26, 2017 10:17 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-
g] On Behalf Of
Robert Jasiek
Sent: Thursday, October 26, 2017 10:17 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] AlphaGo Zero SGF - Free Use or Copyright?
On 26.10.2017 13:52, Brian Sheppard via Computer-go wrote:
> MCTS is the glue that binds incompatible rules.
This is, h
Robert is right, but Robert seems to think this hasn't been done. Actually
every prominent non-neural MCTS program since Mogo has been based on the exact
design that Robert describes. The best of them achieve somewhat greater
strength than Robert expects.
MCTS is the glue that binds incompatibl
I think it uses the champion network. That is, the training periodically
generates a candidate, and there is a playoff against the current champion. If
the candidate wins by more than 55% then a new champion is declared.
Keeping a champion is an important mechanism, I believe. That creates th
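A sketch of that loop (play_game and train_step are placeholders; the 400-game and 55% figures are the ones quoted earlier in this thread):

def training_loop(champion, n_games=400, threshold=0.55):
    # Keep a champion network; promote a candidate only if it wins the
    # playoff by more than the threshold margin.
    while True:
        candidate = train_step(champion)               # placeholder training step
        wins = sum(play_game(candidate, champion)      # 1 if candidate wins, else 0
                   for _ in range(n_games))
        if wins / n_games > threshold:
            champion = candidate                       # new champion declared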
So I am reading that residual networks are simply better than normal
convolutional networks. There is a detailed write-up here:
https://blog.waya.ai/deep-residual-learning-9610bb62c355
Summary: the residual network has a fixed connection that adds (with no
scaling) the output of the previous le
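For anyone who has not seen one, a minimal residual block looks roughly like this (PyTorch purely for illustration; channel counts and normalization details are my assumptions, not the paper's exact spec):

import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    # Illustrative block: two 3x3 convolutions with an unscaled identity skip.
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return F.relu(x + y)   # the fixed connection that adds with no scaling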
o.org] On Behalf Of
Gian-Carlo Pascutto
Sent: Wednesday, October 18, 2017 5:40 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] AlphaGo Zero
On 18/10/2017 22:00, Brian Sheppard via Computer-go wrote:
> This paper is required reading. When I read this team’s papers, I
> think to my
Some thoughts toward the idea of general game-playing...
One aspect of Go is ideally suited for visual NN: strong locality of reference.
That is, stones affect stones that are nearby.
I wonder whether the late emergence of ladder understanding within AlphaGo Zero
is an artifact of the board re
ednesday, October 18, 2017 4:38 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] AlphaGo Zero
On 18/10/2017 22:00, Brian Sheppard via Computer-go wrote:
> A stunning result. The NN uses a standard vision architecture (no Go
> adaptation beyond what is necessary to represent the ga
This paper is required reading. When I read this team’s papers, I think to
myself “Wow, this is brilliant! And I think I see the next step.” When I read
their next paper, they show me the next *three* steps. I can’t say enough good
things about the quality of the work.
A stunning result. The
PM
To: Brian Sheppard
Cc: computer-go
Subject: Re: [Computer-go] Alphago and solving Go
This is semantics. Yes, in the limit of infinite time, it is brute-force.
Meanwhile, in the real world, AlphaGo chooses to balance its finite time budget
between depth & width. The mere fact that the
[mailto:lightvec...@gmail.com]
Sent: Sunday, August 6, 2017 2:54 PM
To: Brian Sheppard ; computer-go@computer-go.org
Cc: Steven Clark
Subject: Re: [Computer-go] Alphago and solving Go
Saying in an unqualified way that AlphaGo is brute force is wrong in the spirit
of the question. Assuming AlphaGo uses a
There is a definition of “brute force” on Wikipedia. Seems to me that MCTS
fits. Deep Blue fits.
From: Álvaro Begué [mailto:alvaro.be...@gmail.com]
Sent: Sunday, August 6, 2017 2:56 PM
To: Brian Sheppard ; computer-go
Subject: Re: [Computer-go] Alphago and solving Go
It was already a
possible in each node. It is in this respect that AlphaGo
truly excels. (AlphaGo also excels at whole board evaluation, but that is a
separate topic.)
From: Steven Clark [mailto:steven.p.cl...@gmail.com]
Sent: Sunday, August 6, 2017 1:14 PM
To: Brian Sheppard ; computer-go
Subject: Re
Yes, AlphaGo is brute force.
No, it is impossible to solve Go.
Perfect play looks a lot like AlphaGo in that you would not be able to tell the
difference. But I think that AlphaGo still has 0% win rate against perfect play.
My own best guess is that top humans make about 12 errors per game. T
>I haven't tried it, but (with the computer chess hat on) these kind of
>proposals behave pretty badly when you get into situations where your
>evaluation is off and there are horizon effects.
In computer Go, this issue focuses on cases where the initial move ordering is
bad. It isn't so much e
Yes. This is a long-known phenomenon.
I was able to get improvements in Pebbles based on the idea of forgetting
unsuccessful results. It has to be done somewhat carefully, because results
propagate up the tree. But you can definitely make it work.
I recall a paper published on this basis.
>... my value network was trained to tell me the game is balanced at the
>beginning...
:-)
The best training policy is to select positions that correct errors.
I used the policies below to train a backgammon NN. Together, they reduced the
expected loss of the network by 50% (cut the error rate
uter-go.org] On Behalf Of
Gian-Carlo Pascutto
Sent: Monday, May 22, 2017 4:08 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] mini-max with Policy and Value network
On 20/05/2017 22:26, Brian Sheppard via Computer-go wrote:
> Could use late-move reductions to eliminate the hard pruning
Could use late-move reductions to eliminate the hard pruning. Given the
accuracy rate of the policy network, I would guess that even move 2 should be
reduced.
-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
Hiroshi Yamashita
Sent: Saturday,
herty...@gmail.com]
Sent: Tuesday, April 18, 2017 9:06 AM
To: computer-go@computer-go.org; Brian Sheppard
Subject: Re: [Computer-go] Patterns and bad shape
Now, I love this idea. A super fast cheap pattern matcher can act as input into
a neural network input layer in sort of a "pay additional
Adding patterns is very cheap: encode the patterns as an if/else tree, and it
is O(log n) to match.
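A toy illustration of the decision-tree idea (the data layout here is mine, not Pebbles'):

def match(node, colors):
    # node is either ("leaf", pattern_id) or ("test", point_index, children),
    # where children maps a point color ('B', 'W', '.') to the next node.
    # colors is the tuple of neighborhood point colors. A balanced tree over
    # n patterns needs only O(log n) tests per lookup.
    while node[0] == "test":
        _, point, children = node
        node = children[colors[point]]
    return node[1]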
Pattern matching as such did not show up as a significant component of Pebbles.
But that is mostly because all of the machinery that makes pattern-matching
cheap (incremental updating of 3x3 n
networks have
always been surprisingly successful at Go. Like Brugmann’s paper about Monte
Carlo, which was underestimated for a long time. Sigh.
Best,
Brian
From: Minjae Kim [mailto:xive...@gmail.com]
Sent: Friday, February 24, 2017 11:56 AM
To: Brian Sheppard ; computer-go@computer
Neural networks always have a lot of local optima. Simply because they have a
high degree of internal symmetry. That is, you can “permute” sets of
coefficients and get the same function.
Don’t think of starting with expert training as a way to avoid local optima. It
is a way to start trainin
If your database is composed of self-play games, then the likelihood
maximization policy should gain strength rapidly, and there should be a way to
have asymptotic optimality. (That is, the patterns alone will play a perfect
game in the limit.)
Specifically: play self-play games using an asy
be able to
improve the rollouts at some point.
Roel
On 31 January 2017 at 17:21, Brian Sheppard via Computer-go
wrote:
If a "diamond" pattern is centered on a 5x5 square, then you have 13 points.
The diagram below will give the idea.
__+__
_+++_
+++++
_+++_
__+__
At o
identify all
potential nakade." seems to indicate this is a known pattern set? could i find
some more information on it somewhere? also i was unable to find Pebbles, is it
open source?
@Robert Jasiek
what definitions/papers/publications are you referring to?
m.v.g. Roel
On 24 Januar
r-go@computer-go.org
Subject: Re: [Computer-go] AlphaGo rollout nakade patterns?
On 23-01-17 20:10, Brian Sheppard via Computer-go wrote:
> only captures of up to 9 stones can be nakade.
I don't really understand this.
http://senseis.xmp.net/?StraightThree
Both constructing this shape and pl
A capturing move has a potential nakade if the string that was removed is among
a limited set of possibilities. Probably AlphaGo has a 13-point bounding
region (e.g., the 13-point star) that it uses as a positional index, and
therefore an 8192-sized (2^13) pattern set will identify all potential nakade.
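A small sketch of how a 13-point region yields an 8192-entry index (the bit assignment is my own illustration):

# The 13-point star: all offsets within Manhattan distance 2 of the center.
DIAMOND = [(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)
           if abs(dx) + abs(dy) <= 2]              # exactly 13 offsets

def nakade_index(removed_points, center):
    # One bit per point of the region, so 2**13 = 8192 possible patterns.
    index = 0
    for bit, (dx, dy) in enumerate(DIAMOND):
        if (center[0] + dx, center[1] + dy) in removed_points:
            index |= 1 << bit
    return index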
I was writing code along those lines when AlphaGo debuted. When it became clear
that AlphaGo had succeeded, then I ceased work.
So I don’t know whether this strategy will succeed, but the theoretical merits
were good enough to encourage me.
Best of luck,
Brian
From: Computer-go [mail
The downside to training the moves of the winner is that you "throw away" half
of the data. But that isn't so bad, because you can always get as much data as
you need.
The upside is that losing positions do not necessarily have a training signal.
That is: there is no good move in a losing posit
IMO, training using only the moves of winners is obviously the practical choice.
Worst case: you "waste" half of your data. But that is actually not a downside
provided that you have lots of data, and as your program strengthens you will
avoid potential data-quality problems.
Asymptotically, yo
The learning rate seems much too high. My experience (which is from backgammon
rather than Go, among other caveats) is that you need tiny learning rates.
Tiny, as in 1/TrainingSetSize.
Neural networks are dark magic. Be prepared to spend many weeks just trying to
figure things out. You can b
Are you the Johannes Laire who works at Cimpress in WTR?
-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
Johannes Laire
Sent: Friday, May 27, 2016 4:45 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Crazystone on Steam
The 201
Ingo Althöfer"
Sent: Thursday, March 31, 2016 4:31 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] new challenge for Go programmers
Hello all,
"Brian Sheppard" wrote:
> ... This is out of line, IMO. Djhbrown asked a sensible question that
> has valuab
This is out of line, IMO. Djhbrown asked a sensible question that has valuable
intentions. I would like to see responsible, thoughtful, and constructive
replies.
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
uurtamo .
Sent: Wednesday, March 30, 2016 7:43 PM
To:
Trouble is that it is very difficult to put certain concepts into mathematics.
For instance: “well, I tried to find parameters that did a better job of
minimizing that error function, but eventually I lost patience.” :-)
Neural network parameters are not directly humanly understandable. They
Hmm, seems to imply a 1000-Elo edge over human 9p. But such a player would
literally never lose a game to a human.
I take this as an example of the difficulty of extrapolating based on games
against computers. (and the slide seems to have a disclaimer to this effect, if
I am reading the text on
ches to contend for championship caliber.
-Original Message-----
From: Brian Sheppard [mailto:sheppar...@aol.com]
Sent: Tuesday, March 15, 2016 6:20 PM
To: 'computer-go@computer-go.org'
Subject: RE: [Computer-go] AlphaGo & DCNN: Handling long-range dependency
>So a small err
>So a small error in the opening or middle game can literally be worth anything
>by the time the game ends.
These are my estimates: human pros >= 24 points lost, and >= 5 game-losing
errors against other human pros.
I relayed my experience of a comparable experiment with chess, and how those
e
Maybe 10 years ago I extrapolated from the results of title matches to arrive
at an estimate of perfect play. My memory is hazy about the conclusions
(sorry), so I would love if someone is curious enough to build models and draw
conclusions and share them.
The basis of the estimate is that Blac
I have the impression that the value network is used to initialize the score of
a node to, say, 70% out of N trials. Then the MCTS is trial N+1, N+2, etc.
Still asymptotically optimal, but if the value network is accurate then you
have a big acceleration in accuracy because the scores start from
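One way to picture that initialization (the virtual-visit mechanism is my guess at an implementation, not AlphaGo's actual code):

class Node:
    # Seed the node with N virtual trials at the value network's estimate,
    # then let real playouts refine the running mean.
    def __init__(self, value_net_estimate, virtual_visits):
        self.visits = virtual_visits                       # the "N trials"
        self.wins = value_net_estimate * virtual_visits    # e.g. 0.70 * N

    def add_result(self, result):          # result in [0, 1] from a playout
        self.visits += 1
        self.wins += result

    def mean(self):
        return self.wins / self.visits     # starts at the value-net estimate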
I would not place too much confidence in the observers. Even though they are
pro players, they don't have the same degree of concentration as the game's
participants, and they have an obligation to speak about the game on a regular
basis, which further degrades their analysis. Figuring out w
Play on KGS. Pros can be anonymous, and test themselves and AlphaGo at the same
time. :-)
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Jim
O'Flaherty
Sent: Saturday, March 12, 2016 4:56 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Congratulation
Playing on KGS will give a good sense. Players have sustained ranks between 9
and 10 dan there, but never over 10 dan. If AlphaGo can sustain over 10 dan,
then there is clear separation.
-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
Thoma
Actually chess software is much, much better. I recall that today's software
running on 1998 hardware beats 1998 software running on today's hardware.
It was very soon after 1998 that ordinary PCs could play on a par with world
champions.
-Original Message-
From: Computer-go [mailto:com
Amen to Don Dailey. He would be so proud.
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Jim
O'Flaherty
Sent: Thursday, March 10, 2016 6:49 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Finding Alphago's Weaknesses
I think we are going to see a
The method is not likely to work, since the goal of NN training is to reduce the
residual error to a random function of the NN inputs. If the NN succeeds at
this, then there will be no signal to train against. If the NN fails, then it
could be because the NN is not large enough, or because there
You play until neither player wishes to make a move. The players are willing to
move on any point that is not self-atari, and they are willing to make
self-atari plays if capture would result in a Nakade
(http://senseis.xmp.net/?Nakade)
This correctly plays seki.
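In code the rule might look roughly like this (the board-query helpers and opponent() are placeholders for whatever the playout engine provides):

import random

def willing_to_play(board, point, color):
    if not board.is_legal(point, color):
        return False
    if not board.is_self_atari(point, color):             # placeholder query
        return True
    # Self-atari is acceptable only when the capture it invites is a nakade.
    return board.capture_would_be_nakade(point, color)    # placeholder query

def playout(board, color):
    passes = 0
    while passes < 2:                        # stop when neither side will move
        moves = [p for p in board.empty_points()
                 if willing_to_play(board, p, color)]
        if moves:
            board.play(random.choice(moves), color)
            passes = 0
        else:
            passes += 1                      # this side declines to move
        color = opponent(color)              # placeholder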
From: Computer-go [mail
Seems related to “temporal difference learning.” Each position’s max value
should equal the next position’s min value in the asymptote, so you train the
system accordingly.
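A sketch of that consistency condition as a training target (legal_successors and value are placeholders; the negamax sign convention is my choice of notation):

def td_target(value, position):
    # value(p) is the network's estimate from the viewpoint of the player to
    # move at p. In the asymptote the mover's max should equal the opponent's
    # min at the successor, i.e. value(position) == max over moves of
    # -value(successor); train value(position) toward this target.
    return max(-value(s) for s in legal_successors(position))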
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
Xavier Combelle
Sent: Saturday, January 30,
seem to work quite well.
There is a paper by him on this topic in the proceedings of the conference
Advances in Computer Games 2015 (in Leiden , NL), published by Springer
Lecture Notes in Computer Science (LNCS).
A very early application of truncated rollouts was made by Brian
Sheppard i
https://chessprogramming.wikispaces.com/Tobias+Graf
there are also good research reports on his web site.
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Aja
Huang
Sent: Friday, September 25, 2015 1:42 PM
To: computer-go@computer-go.org
Subject: Re: [Computer-go
Alas, the limitations of a hosted environment would make participating in
Baduk.io a non-starter for me. It is much better to build a selection of
standard bots and run tests on my own hardware. Which is what I have done in
the absence of CGOS.
IMO, running using one’s own hardware/OS/langua
A 3-layer network (input, hidden, output) is sufficient to be a universal
function approximator, so from a theoretical perspective only 3 layers are
necessary. But the gap between theoretical and practical is quite large.
The CNN architecture builds in translation invariance and sensitivity t
I wondered that too, because the search tree frequently reaches positions by
transpositions.
Only testing would say for sure. And even then, YMMV.
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
Stefan Kaitschick
Sent: Monday, December 22, 2014 9:46 AM
To: comp
>This is Aya's move predictor(W) vs GNU Go(B).
>http://eidogo.com/#3BNw8ez0R
>I think previous move effect is too strong.
This is a good example of why a good playout engine will not necessarily play
well. The purpose of the playout policy is to *balance* errors. Following your
opponent's last
e paper and I hope you will enjoy
reading it.
Regards,
Aja
On Tue, Dec 16, 2014 at 2:03 PM, Brian Sheppard wrote:
My impression is that each feature gets a single weight in Crazy-stone. The
team-of-features aspect arises because a single point can match several
patterns, so you need
is directly to
play? If the former, it had better be reasonably fast. The latter approach can
be far slower, but requires the predictions to be of much higher quality. And
the biggest question, how can you make these two approaches interact
efficiently?
René
On Mon, Dec 15, 2014 at 8:0
>Is it really such a burden?
Well, I have to place my bets on some things and not on others.
It seems to me that the costs of a NN must be higher than a system based on
decision trees. The convolution NN has a very large parameter space if my
reading of the paper is correct. Specifically,
So I read this kind of study with some skepticism. My guess is that the
"large-scale pattern" systems in use by leading programs are already pretty
good for their purpose (i.e., progressive bias).
Rampant personal and unverified speculation follows...
I recommend the paper
http://hal.inria.fr/docs/00/36/97/83/PDF/ouvertures9x9.pdf by the Mogo team,
which describes how to use a grid to compute Mogo's opening book using
coevolution.
Brian
>What do you do when you add a new parameter? Do you retain your existing
>'history', considering each game to have been played with the value of
>the new parameter set to zero?
Yes, exactly.
>If you have 50 parameters already, doesn't adding a new parameter create
>a rather large number of new p
>Your system seems very interesting, but it seems to me that you assume
>that each parameter is independent.
>What happens if, for example, two parameters work well when only one of
>them is active, and badly if the two are active at the same time?
I think that I am assuming only that the objecti
>From what I understand, for each parameter you take with some high
>probability the best so far, and with some lower probability the least
>tried one. This requires (manually) enumerating all parameters on some
>integer scale, if I got it correctly.
Yes, for each parameter you make a range of val
>On that topic, I have around 17 flags that enable or disable features in my
>pure playout bots, and I want to search for the best combination of them.
>I know this is almost a dream, but does anyone know the best way to
>approximate this.
Pebbles randomly chooses (using a zero asymptotic regret strategy)
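A sketch of the per-parameter rule as I read it (the statistics layout and exploration probability are my own choices; an exploration probability that decays over time is what gives the zero-asymptotic-regret property):

import random

def choose_value(stats, explore_prob):
    # stats: one [wins, trials] pair per candidate value of this parameter.
    # With high probability play the best value so far; with probability
    # explore_prob play the least-tried value.
    if random.random() < explore_prob:
        return min(range(len(stats)), key=lambda i: stats[i][1])
    return max(range(len(stats)), key=lambda i: stats[i][0] / (stats[i][1] or 1))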
Man, I just don't live in the right location. First Paris, and now Japan.
Can't a position open in, say, Boston? :-(
>The PRNG (pseudo random number generator) can cause the super-linear
>speed-up, for example.
Really? The serial simulation argument seems to apply.
Back of envelope calculation: MFG processes 5K nodes/sec/core * 4 cores per
process = 20K nodes/sec/process. Four processes makes 80K nodes/sec. If you
think for 30 seconds (pondering + move time) then you are at 2.4 million
nodes. Figure about 25,000 nodes having 100 visits or more. UCT data is
ro
>> Parallelization *cannot* provide super-linear speed-up.
>
>I don't see that at all.
This is standard computer science stuff, true of all parallel programs and
not just Go players. No parallel program can be better than N times a serial
version.
The result follows from a simulation argument. Suppose an N-core program were more than N times faster than the best serial program; then a single core could simulate the N cores by time-slicing, slowing down by at most a factor of N, and the resulting serial program would beat the best serial program, a contradiction.
While re-reading the parallelization papers, I tried to formulate why I
thought that they couldn't be right. The issue with Mango reporting a
super-linear speed-up was an obvious red flag, but that doesn't mean that
their conclusions were wrong. It just means that Mango's exploration policy
needs t
>I only share UCT wins and visits, and the MPI version only
>scales well to 4 nodes.
The scalability limit seems very low.
Just curious: what is the policy for deciding what to synchronize? I recall
that MoGo shared only the root node.
>I personally just use root parallelization in Pachi
I think this answers my question; each core in Pachi independently explores
a tree, and the master thread merges the data. This is even though you have
shared memory on your machine.
>Have you read the Parallel Monte-Carlo Tree Search paper?
I have a question for those who have parallelized programs.
It seems like MPI is the obvious architecture when scaling a program to
multiple machines. Let's assume that we implement a program that has that
capability.
Now, it is possible to use MPI for scaling *within* a compute node. For
example
>> BTW, it occurs to me that we can approximate the efficiency of
>> parallelization by taking execution counts from a profiler and
>> post-processing them. I should do that before buying a new GPU. :-)
>I wonder what you mean by that.
If you run your program on a sequential machine and count sta
>In my own gpu experiment (light playouts), registers/memory were the
>bounding factors on simulation speed.
I respect your experimental finding, but I note that you have carefully
specified "light playouts," probably because you suspect that there may be a
significant difference if playouts are h