No hurry, I will be away for a week next week. :-)
Isaac,
I haven't looked at this stuff for a while. I'm not at home so I can't
look at it now.
From the error I understand that tesuji/games/general/MoveIterator is
missing. It is there in the Subversion source-tree however. So I don't
On Sat, Jan 17, 2009 at 08:29:32PM +0100, Sylvain Gelly wrote:
A small point: in PlayoutOutTree, just after if
(!played.AlreadyPlayed(move)) {, there should have a played.Play(move).
I believe it does not change the final result (as the check is also done in
the backup, and the move played in
I had a little spare time to look at it. It seems indeed I forgot to
update the GoEngineGTP.jar file last time I made some changes. This
was easy to fix even from here and I think it should work now.
Just as a note, if you want to change the number of playouts to 50K,
you need to change
Boon tesujisoftw...@gmail.com
An: computer-go computer-go@computer-go.org
Betreff: Re: [computer-go] How to properly implement RAVE?
I had a little spare time to look at it. It seems indeed I forgot to
update the GoEngineGTP.jar file last time I made some changes. This
was easy to fix even
On Fri, 2009-02-06 at 18:55 +0100, Isaac Deutsch wrote:
The rating of the bot still seems to be drifting upwards, but I think I can
conclude my UCT implementation is OK afterall. Many thanks to the bots
provided. Does someone have a bot that does 50k light playouts + RAVE? I
would be most
On Sun, Feb 8, 2009 at 3:02 PM, Isaac Deutsch i...@gmx.ch wrote:
Hi,
Can you explain what minimumNrNodes and nrSimulations do? In my program I
play 50k games regardless of the number of nodes, so I would like to adjust
this accordingly.
minimumNrNodes is the number of games played out.
I'm currently tied up but you can get my MCTS implementation, which
includes RAVE, and set it up to play 50K playouts. It's only a matter
of setting the right number in the configuration file.
You can also use it to play through two-gtp, that way you can test an
awful lot faster.
Mark
@computer-go.org
Betreff: Re: [computer-go] How to properly implement RAVE?
You can get everything here:
http://plug-and-go.dev.java.net
The MCTS program description is under 'Derived Projects'.
You don't really need the source-code. You can just get the 'binaries
scripts' and then copy
/MoveIterator
Original-Nachricht
Datum: Sat, 7 Feb 2009 09:30:53 -0200
Von: Mark Boon tesujisoftw...@gmail.com
An: computer-go computer-go@computer-go.org
Betreff: Re: [computer-go] How to properly implement RAVE?
You can get everything here:
http://plug
The rating of the bot still seems to be drifting upwards, but I think I can
conclude my UCT implementation is OK afterall. Many thanks to the bots
provided. Does someone have a bot that does 50k light playouts + RAVE? I
would be most grateful if you could put them online for a few days of
testing.
On Feb 6, 2009, at 9:55 AM, Isaac Deutsch wrote:
By the way, I've seen 2 games when checking my bot's status where
one of the
myCtest bots lost because of an illegal ko move. Maybe there's a
bug in
handling superko?
Not a bug, I never implemented it :-(
Christoph
On Feb 6, 2009, at 12:55 PM, Isaac Deutsch i...@gmx.ch wrote:
The rating of the bot still seems to be drifting upwards, but I
think I can
conclude my UCT implementation is OK afterall. Many thanks to the bots
provided. Does someone have a bot that does 50k light playouts +
RAVE? I
would be
I'm currently tied up but you can get my MCTS implementation, which
includes RAVE, and set it up to play 50K playouts. It's only a matter
of setting the right number in the configuration file.
You can also use it to play through two-gtp, that way you can test an
awful lot faster.
Mark
On Fri,
Thanks Christoph.
I've changed my bot to play 50k games. I know the results aren't very
statistically meaningful yet, but the 2 bots look close to each other
already. ;-) I will let it run for some more days if I can.
http://img145.imageshack.us/img145/3586/bild3bk3.png
-ibd
I have started
I actually tried leaf parallelization first, but after reading the
mentioned paper I switched to an implementation of root parallelization (as
described). I'm not sure if I implemented it correctly (like in my
description), but after testing a 2-core-version against a single-
core-version with a
On Sun, Feb 1, 2009 at 4:46 PM, Isaac Deutsch i...@gmx.ch wrote:
By the way, I got about 75 ELO points (1650-1720) with light playouts out
of RAVE. Do you think this is in the expected range? It's not really similar
to the 20%-60% win rate rise vs. GnuGo described in some papers...
My bot
On Tue, Feb 3, 2009 at 1:53 PM, Isaac Deutsch i...@gmx.ch wrote:
Hi Jason,
Thanks for your numbers. I might try to limit my bot to 50k playouts and 1
core, but I usually simulate as long as time permits.
That kind of setup should make it easier to compare. There have been a few
times in
Hi Jason,
Thanks for your numbers. I might try to limit my bot to 50k playouts and 1
core, but I usually simulate as long as time permits. Do you suspect my
pure UCT version has bugs, too, judging from its rating? I find it hard to
find good tests for the correctness of a program depending on
On Feb 2, 2009, at 6:57 AM, Isaac Deutsch i...@gmx.ch wrote:
Hi Issac,
You should be more in the range of +200-300 ELO, at least with
pattern
based
playouts.
Sylvain
Isaac. They are not pattern based playouts, but as I said uniformly
random.
I reckon the effect of RAVE is less with
On Feb 2, 2009, at 9:40 AM, Jason House jason.james.ho...@gmail.com
wrote:
On Feb 2, 2009, at 6:57 AM, Isaac Deutsch i...@gmx.ch wrote:
Hi Issac,
You should be more in the range of +200-300 ELO, at least with
pattern
based
playouts.
Sylvain
Isaac. They are not pattern based
Wow, thanks for all the answers! You're being really helpful.
Do you use UCT with a too large exploration term?
That's a good idea. I actually use a rather big value for c=0.5. I might try
lowering it. Thanks! (Precisely, the formula is c*sqrt(log(p)/c).)
My first (braindead) multi-threaded UCT
On Mon, 2009-02-02 at 18:09 +0100, Isaac Deutsch wrote:
I don't think many people realize that you have to play hundreds of
games just to be within 40 or 50 ELO with much certainty. If you
play
less than 100 games you could easily be off by over 100 ELO.
Maybe I'm a bit (a lot :)
I have about 200-300k games/move, so maybe the effect is even less. But,
maybe I still have a grave bug somewhere. I will check again.
Cheers, ibd
On Feb 2, 2009, at 9:57 AM, Isaac Deutsch wrote:
They are not pattern based playouts, but as I said uniformly random.
I reckon the effect of
On Mon, 2009-02-02 at 09:40 -0500, Jason House wrote:
Also, I noticed your rank measurements were based on CGOS results
after relatively few games. It can retain significant bias for quite
a
while.
Yes, and you should go by the bayeselo page which is a better picture of
what is going on.
On Feb 2, 2009, at 12:09 PM, Isaac Deutsch i...@gmx.ch wrote:
Wow, thanks for all the answers! You're being really helpful.
Do you use UCT with a too large exploration term?
That's a good idea. I actually use a rather big value for c=0.5. I
might try
lowering it. Thanks! (Precisely, the
On Feb 2, 2009, at 9:57 AM, Isaac Deutsch wrote:
They are not pattern based playouts, but as I said uniformly random.
I reckon the effect of RAVE is less with these?
My MCTS implementation sees a gain of 200-400 ELO from RAVE using
uniformly random moves. 400 gain (90% win-rate) for 2K
Hi Issac,
You should be more in the range of +200-300 ELO, at least with pattern
based
playouts.
Sylvain
Isaac. They are not pattern based playouts, but as I said uniformly random.
I reckon the effect of RAVE is less with these?
How many playouts per second do you get with each version?
My first (braindead) multi-threaded UCT played weaker with two cores
than one core. How do you combine search trees/results? How do you pick
a move to play?
There were a couple of papers [2] at CG2008 on this subject. The
consensus seemed to be that root parallelization [1] was best. In fact
By the way, I got about 75 ELO points (1650-1720) with light playouts out of
RAVE. Do you think this is in the expected range? It's not really similar to
the 20%-60% win rate rise vs. GnuGo described in some papers...
At the moment I'm also tuning the bias in the range 0.001-0.1. Given
How many playouts per second do you get with each version?
Sent from my iPhone
On Feb 1, 2009, at 4:46 PM, Isaac Deutsch i...@gmx.ch wrote:
By the way, I got about 75 ELO points (1650-1720) with light
playouts out of RAVE. Do you think this is in the expected range?
It's not really similar
Quoting Sylvain Gelly sylvain.ge...@m4x.org:
On Wed, Jan 28, 2009 at 10:01 PM, Isaac Deutsch i...@gmx.ch wrote:
And a final question: You calculate the (beta) coefficient as
c = rc / (rc+c+rc*c*BIAS);
which looks similar to the formula proposed by David Silver (If I recall
his name
On Fri, Jan 30, 2009 at 9:29 AM, Magnus Persson magnus.pers...@phmp.sewrote:
Quoting Sylvain Gelly sylvain.ge...@m4x.org:
On Wed, Jan 28, 2009 at 10:01 PM, Isaac Deutsch i...@gmx.ch wrote:
And a final question: You calculate the (beta) coefficient as
c = rc / (rc+c+rc*c*BIAS);
which
Quoting Sylvain Gelly sylvain.ge...@m4x.org:
On Fri, Jan 30, 2009 at 9:29 AM, Magnus Persson
magnus.pers...@phmp.sewrote:
Did you try to tune the bias constant at all or just took the one I posted?
I wrote it from memory and I believe it is in the range of possibly good
values, but it is
At the moment I'm also tuning this constant in the range 0.001-0.1. Given
uniformly random (light) playouts, is the bias expected to be bigger than with
heavy playouts, meaning the constant has to be bigger aswell?
--
Pt! Schon vom neuen GMX MultiMessenger gehört? Der kann`s mit allen:
On Wed, Jan 28, 2009 at 10:01 PM, Isaac Deutsch i...@gmx.ch wrote:
Hi again ;)
I found some time to actually implement this stuff. And, this has raised
some small questions. In this part of the code:
for (j = index; j moves_played_in_tree.size(); j += 2) {
//stuff
}
for (j =
Hi again ;)
I found some time to actually implement this stuff. And, this has raised
some small questions. In this part of the code:
for (j = index; j moves_played_in_tree.size(); j += 2) {
//stuff
}
for (j = 0; j moves_played_out_tree.size(); ++j) {
//more stuff
// If
On Sat, Jan 17, 2009 at 08:29:32PM +0100, Sylvain Gelly wrote:
ChooseMove(node, board) {
bias = 0.015 // I put a random number here, to be tuned
b = bias * bias / 0.25
best_value = -1
best_move = PASSMOVE
for (move in board.allmoves) {
c = node.child(move).counts
w =
Quoting Thomas Lavergne thomas.laver...@reveurs.org:
- the best play is a good only if played immediatly and very bad if
played later in the game :
- the first playout for this play resulted in a lost.
score and RAVE score will be very low and this play will never be
considered again
On Jan 21, 2009, at 10:23 AM, Magnus Persson wrote:
Quoting Thomas Lavergne thomas.laver...@reveurs.org:
- the best play is a good only if played immediatly and very bad if
played later in the game :
- the first playout for this play resulted in a lost.
score and RAVE score will be very
Well, empirically, when I set the exploration component to zero it starts
to play a lot worse. Like I wrote: the winning percentage drops to 24% vs.
the same program with the exploration component, which is a huge difference.
So if you have a different experience, you must have something
On Jan 21, 2009, at 11:53 AM, Olivier Teytaud wrote:
Here, we have a non-zero initialization of the number of wins, of
the numbere of simulations, of the number of Rave-wins, of the
number of Rave-losses.
We have then a 0 constant for exploration, but also an exploratory
term which is
I'd like to make sure I understand what you mean exactly. You use some
heuristics to intialize all the moves (or maybe some of the moves) with a
certain win-loss and rave-win-loss ratios?
Not only ratios, but also numbers of simulations. Thanks to patterns, expert
rules.
To a certain
Hi,
I did not mention here the prior initialization that is done in each node.
When you create a node, you can look at all possible move and if a pattern
matches (the exact same as in the playout) you initialize rw and rc to 14.
If the move saves a capture (same as in the playout), same
Of course you can do put much more clever prior if you are a player and
know the subtleties of the game.
E.g. patterns extracted from databases - but it's not enough, carefully tune
the coefficients for empty triangles (important!) and various other
importants patterns/rules (don't just keep
2009/1/21 Olivier Teytaud olivier.teyt...@lri.fr
Of course you can do put much more clever prior if you are a player and
know the subtleties of the game.
E.g. patterns extracted from databases - but it's not enough, carefully
tune the coefficients for empty triangles (important!) and
Well, now mogo has an exploration term - but not at all UCB-like.
I was talking about times where I was still there ... ages ago :)
Good old times :-)
you've been helpful several times even from far away :-)
___
computer-go mailing list
I think it is too expensive to read ladders during playouts. I remember
that you have faster ladders search code so it might not cost you as much.
My playout code has no ability to undo a move or do any kind of lookahead.
David
Some examples: David Fotland wrote he does light playouts with
On Jan 18, 2009, at 4:11 PM, David Fotland wrote:
I think it is too expensive to read ladders during playouts. I
remember
that you have faster ladders search code so it might not cost you as
much.
My playout code has no ability to undo a move or do any kind of
lookahead.
Yes, my ladder
On Jan 18, 2009, at 5:38 PM, Magnus Persson wrote:
In Valkyria I solved this by playing out the ladder on the playout
board, and store all changes on a stack. When the ladder undos moves
it just pop changes from the stack. In this way I can also use the
rich board representation of
Hi,
Sorry for the slow reply.
After those discussions, I figured out that pseudo code was the
fastest clear and not ambiguous way to describe how to precisely
implement the algorithm. I needed to find some time to write it.
Note that I did not write only the backup phase because to clearly
Hi,
Sorry for the slow reply.
Hi
I'd prefer quality over speed anytime. ;)
Your pseudo code is excellent and helped me to understand some of the trickier
things. Thanks again!
I think I will now be able to implement my own version. :)
Regards,
Isaac
--
Pt! Schon vom neuen GMX
Thanks for taking the time Sylvain. I took a quick look to see how
your pseudo-code compared to the reference implementation I made.
There are a few small differences, and I think the place(s) where I
deviated is exactly what confused Daniel Waeber.
The first difference is that when I
A small point: in PlayoutOutTree, just after if
(!played.AlreadyPlayed(move)) {, there should have a played.Play(move).
I believe it does not change the final result (as the check is also done in
the backup, and the move played in the backup), but I simply forgot that
line (that should make
Hi Mark,
For the first difference you mention, as far as I remember it makes a small
but significant difference and is one of the main reason I was talking about
tricky details.
For the second difference, I also don't want to count a move if the
opposite color played on that point first, and,
Hi Issac,
You are welcome, and I am happy there is finally a clearer of implementing
RAVE out there. I believe I should have done it much earlier, sorry for
that, but better late than never, no? :)
Best,
Sylvain
2009/1/17 Isaac Deutsch i...@gmx.ch
Hi,
Sorry for the slow reply.
Hi
I'd
I think I found a bug in ChooseMove
Quoting Sylvain Gelly sylvain.ge...@m4x.org:
coefficient = 1 - rc * (rc + c + rc * c * b)
I think this has to be
coefficient = 1 - rc / (rc + c + rc * c * b)
thus when c is 0 (initially) the coefficient is 0
when c goes towards infinity the
Good catch :)Indeed it makes no sense with the *, sorry...
Sylvain
2009/1/17 Magnus Persson magnus.pers...@phmp.se
I think I found a bug in ChooseMove
Quoting Sylvain Gelly sylvain.ge...@m4x.org:
coefficient = 1 - rc * (rc + c + rc * c * b)
I think this has to be
coefficient = 1 -
On Jan 17, 2009, at 5:41 PM, Sylvain Gelly wrote:
For the first difference you mention, as far as I remember it makes
a small but significant difference and is one of the main reason I
was talking about tricky details.
OK, I ran a test and after 1,000 games with 2K semi-light playouts I
Thanks you. I think that I understand it now :)
On 23:21 Wed 14 Jan , Mark Boon wrote:
You have to understand that the 'start' variable really starts at the
root from the position for which we do the search. So all the moves
'below' the playoutNode are also taken into account. The
On Jan 15, 2009, at 10:47 AM, Daniel Waeber wrote:
Thanks you. I think that I understand it now :)
On 23:21 Wed 14 Jan , Mark Boon wrote:
You have to understand that the 'start' variable really starts at the
root from the position for which we do the search. So all the moves
'below' the
Hi,
On 11:24 Thu 15 Jan , Mark Boon wrote:
On Jan 15, 2009, at 10:47 AM, Daniel Waeber wrote:
Thanks you. I think that I understand it now :)
On 23:21 Wed 14 Jan , Mark Boon wrote:
You have to understand that the 'start' variable really starts at the
root from the position
On Jan 15, 2009, at 12:33 PM, Daniel Waeber wrote:
yes, but the weight/color maps stay the same for all updated nodes.
I think the first playoutNode (the one most deep inside the tree) only
should get amaf values for the random playout, the next one form
random
playout + from the first
On 12:53 Thu 15 Jan , Mark Boon wrote:
On Jan 15, 2009, at 12:33 PM, Daniel Waeber wrote:
yes, but the weight/color maps stay the same for all updated nodes.
I think the first playoutNode (the one most deep inside the tree) only
should get amaf values for the random playout, the
Hi,
I'm also interested at the same thing.
snip
I tried putting this into pseudo code, but it already looks like C. ;p
http://pastie.org/357231
If you could look at it, I would be most grateful.
It sounds good but it seems to lack the check of whether a given move is
first played in
On Jan 14, 2009, at 9:36 AM, Daniel Waeber wrote:
I have a question about the the part were the stats are updated.
(l.15-25). having an array of amaf-values in every node seems very
memory
intensive and I don't get how you would access these values.
You are right, it is memory intensive.
Hi,
I'm also interested at the same thing.
Glad I'm not alone. ;)
It sounds good but it seems to lack the check of whether a given move is
first played in a given intersection. When you add that, it because a
little
more tricky to update in the tree.
I see, I missed that. I actually
Ok, I still have same questions about the refbot code.
On 10:29 Wed 14 Jan , Mark Boon wrote:
On Jan 14, 2009, at 9:36 AM, Daniel Waeber wrote:
I have a question about the the part were the stats are updated.
(l.15-25). having an array of amaf-values in every node seems very
On Jan 14, 2009, at 10:55 PM, Daniel Waeber wrote:
Accessing the AMAF values depends on your implementation. The
following is a code-snippet from my MCTS reference implementation
that
updates the AMAF values in the tree:
if (_useAMAF)
{
IntStack playoutMoves =
2009/1/10 Isaac Deutsch i...@gmx.ch
Hi Sylvain,
I think it's starting to make sense now. :-)
Sorry to be unclear. I wish we have a white board where we could discuss
and
that would sorted out in a few minutes :).
Several results turn up in a google search ;p
Hi Sylvain,
I think it's starting to make sense now. :-)
Sorry to be unclear. I wish we have a white board where we could discuss
and
that would sorted out in a few minutes :).
Several results turn up in a google search ;p
http://www.google.com/search?q=online+white+board
What I tried to
I'm a bit confused about the difference between AMAF and RAVE (if there's one).
As far as I understand, with AMAF, you sample in each playout (after it leaves
the tree) the moves played (by both players), but only the first move at any
position, then you reward all moves played either by a win or
Hi Isaac,
in a nutshell RAVE is basically AMAF adapted for Monte Carlo Tree Search.
The original paper describing it is
http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf and a
paper for broader audience can be found here:
http://www.lri.fr/~gelly/paper/MoGoNectar.pdf (the picture
Hi Sylvain,
Thanks for your quick answer.
in a nutshell RAVE is basically AMAF adapted for Monte Carlo Tree Search.
The original paper describing it is
http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf and a
paper for broader audience can be found here:
Hi Isaac,
2009/1/9 Isaac Deutsch i...@gmx.ch
Hi Sylvain,
Thanks for your quick answer.
in a nutshell RAVE is basically AMAF adapted for Monte Carlo Tree Search.
The original paper describing it is
http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf and a
paper for
74 matches
Mail list logo