Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-03 Thread Petr Baudis
On Mon, Jan 03, 2011 at 08:24:43AM +0800, Aja wrote:
 Zen uses sequence-like AND probabilistic simulation.
 Basically it plays around the previous move randomly like MoGo, and these
 moves are biased by gamma values like Crazy Stone.
 
 I am also trying to use probabilistic simulation on the whole board, but
 it does not yet succeed. The main problem is how to combine the semeai,
 seki and useless-move detection with it.
 
   Very surprised that Zen is using partly sequence-like simulation
 like Mogo. Erica uses probabilistic simulation completely. I am not
 sure if my implementation is good enough to be called succeed ,
 but I am sure that you will succeed.

Hmm, I thought based on your paper that Erica uses a sort of a hybrid -
probabilistic simulation completely, but with all but 3x3 feature filled
in only around the last move anyway. So, one could say, Zen-like, but
with 3x3-based tenuki sampling or whatever you want to call it. :-)
Has this changed recently?

Petr Pasky Baudis
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-03 Thread Aja

 Hi Petr,


Hmm, I thought based on your paper that Erica uses a sort of a hybrid -
probabilistic simulation completely, but with all but 3x3 feature filled
in only around the last move anyway. So, one could say, Zen-like, but
with 3x3-based tenuki sampling or whatever you want to call it. :-)
Has this changed recently?


  No, at the time of our simulation balancing paper, Erica was using 
probabilistic simulation completely, without seki knowledge. That also 
surprised Remi and me much, that balanced simulation is so powerful: 
achieved 80% against Fuego 0.4 without dealing with seki completely. I don't 
understand what do you mean by 3x3-based tenuki sampling. Right now Erica 
is using a difference approach. I have already described to Yamato in 
private.


 Aja 


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Olivier Teytaud
Hi;


 C*RAVE+(1-C)*UCT


 This has been my understanding. However, I am surprized to find out that
 people have been setting C close to one, according to Petr and Oliver's
 postings, which is essentially AMAF. MF apparently is doing something
 different.


As explained by Aja, I did not mean that; I meant that we use UCT with a
coefficient 0 in
front of the UCB term +C_{uct} sqrt ( log( ...) / ...).

Best regards,
Olivier
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Aja
Hi Olivier,

Thanks for all your corrections and detailed replies. I agree with them. If you 
think it's OK, I would like to describe Mogo's contributions in my thesis 
according to what you enumerated and explained.

Considering and responding to the previous move in simulation, especially 
seeing the 3x3 patterns around (default policy of Mogo) is indeed a big 
contribution (so called Mogo's magic formula). These days, programs of Crazy 
Stone thread, like Zen, Erica, Aya...etc, do stochastic simulation, different 
with Mogo-type, fixed-sequence simulation. But the most important features, 
like save by capturing, save by extension are indeed inspired from Mogo. If 
I realize correctly, it was invented by Yizao, who is also a KGS 2d Go player. 
It's REALLY a pity that Sylvain and Yizao left Mogo team so soon, after 
Computer Olympiad 2007.

I have only one more question. You mentioned this,

--
the idea of balancing - the situation should not be better for one of the two 
players after one move by each player. This was claimed already in Sylvain's 
thesis and (I think) earlier than that.
--

Can you point out where did Sylvain claimed the idea of balancing of 
simulation in his thesis 
(http://www.lri.fr/~gelly/paper/SylvainGellyThesis.pdf)? I check the thesis, 
Sylvain talked about balancing of exploration and exploitation, but not 
simulation (maybe he was not using such wording balancing?). David also 
didn't cite his thesis in his simulation balanacing paper. If Sylvain really 
proposed such concept, I would like to cite it in my thesis.

Thanks,
Aja


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Olivier Teytaud
 These days, programs of Crazy Stone thread, like Zen, Erica, Aya...etc, do
 stochastic simulation, different with Mogo-type, fixed-sequence simulation.


I think CrazyStone's update formula allows more flexibility than the
pattern-based approach of MoGo. So it has better potential;
I have no idea of whether one can really benefit from this flexibility, but
there is a nice flexibility, without big computational cost,
and this representation (by marginal probabilities) is a good algorithmic
choice in CrazyStone and related programs. On the other hand, the
probability involved in the Monte-Carlo part, as originally defined in
CrazyStone, were not very efficient, in spite of the fact that they play Go
better (still the apparent contradiction that a Monte-Carlo which plays well
is not necessarily a good part of a MCTS). The modified versions were, I
think, influenced by MoGo's patterns as defined by Yizao (I think other
persons helped a lot for this, for the experimental setup and the
organization of the code - I consider Sylvain as important as Yizao for this
even if Sylvain did not design the patterns - I only consider myself of very
moderate influence :-) ).

We could win a lot on the Monte-Carlo part with fillboard and nakade and
a few handcrafted patterns. No idea of whether better
formula could be designed using the more flexible formalism of
CrazyStone-like formula.



 Can you point out where did Sylvain claimed the idea of balancing of
 simulation in his thesis (
 http://www.lri.fr/~gelly/paper/SylvainGellyThesis.pdfhttp://www.lri.fr/%7Egelly/paper/SylvainGellyThesis.pdf)?
 I check the thesis, Sylvain talked about balancing of exploration and
 exploitation, but not simulation (maybe he was not using such wording
 balancing?). David also didn't cite his thesis in his simulation
 balanacing paper. If Sylvain really proposed such concept, I would like to
 cite it in my thesis.



Sylvain posted in Computer-Go an explanation of that a long time ago - but
without the word balancing, probably. See page 154.

Best regards,
Olivier
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Aja
Hi Olivier,

I forgot that you also contributed to Mogo's pattern design, besides Sylvain 
and Yizao. Sorry for neglecting your contribution. :) 

Actually I make fillboard and nakade as features in Erica and they work 
fine. But indeed, efficiency is a problem in our probabilistic simulation.

Thanks for reminding of Sylvain's claim. Page 154 is right the page.

Aja___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Olivier Teytaud
 Sorry for neglecting your contribution. :)


You can neglect mine, no problem with that, it's a reality :-)


 Actually I make fillboard and nakade as features in Erica and they work
 fine. But indeed, efficiency is a problem in our probabilistic simulation.


For us fillboard = around 80% with reasonnable time settings (increasing
with comp. time);
I'm afraid the improvement is smaller in other implementations (and I don't
know why).

Olivier
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Fuming Wang
Hi Aja,


On Sun, Jan 2, 2011 at 1:29 AM, Aja ajahu...@gmail.com wrote:

  Hi Fuming,

  C*RAVE+(1-C)*UCT

 C is computed dynamically in search, but not set to a fixed value. Maybe
 you mean UCT_C,

 UCT=UCT_mean+UCT_C*exploration_term

 What Petr and Olivier do, I think, is set UCT_C to 0, to disable the
 exploration_term, not the weight of RAVE.


Without the exploring term, the UCT is just mean win rate, so there's no
point in calling it UCT or UCB. Basically, what people have been saying is
that currently the tree search is based on combination of sequence dependent
rate (average win rate) and sequence independent/almost independent (rave
rate) instead of combination of exploitation (win rate) and exploration (UCB
term). Is this understanding close?

Fuming
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Aja
80% is a big improvement. Indeed, I don't get that much in Erica.

Aja
  - Original Message - 
  From: Olivier Teytaud 
  To: Aja ; computer-go 
  Sent: Sunday, January 02, 2011 9:04 PM
  Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ?




Sorry for neglecting your contribution. :) 

  You can neglect mine, no problem with that, it's a reality :-)  



Actually I make fillboard and nakade as features in Erica and they work 
fine. But indeed, efficiency is a problem in our probabilistic simulation.

  For us fillboard = around 80% with reasonnable time settings (increasing with 
comp. time); 
  I'm afraid the improvement is smaller in other implementations (and I don't 
know why). 

  Olivier




___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Aja
Hi Fuming,

I agree with your understanding. As for the naming of UCT, I think it's fine, 
because we can just view such formula as assigned with zero coefficient of the 
exploration term.

Aja

  - Original Message - 
  From: Fuming Wang 
  To: Aja 
  Cc: computer-go@dvandva.org 
  Sent: Sunday, January 02, 2011 10:35 PM
  Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ?


  Hi Aja,



  On Sun, Jan 2, 2011 at 1:29 AM, Aja ajahu...@gmail.com wrote:

Hi Fuming,
 
C*RAVE+(1-C)*UCT
 
C is computed dynamically in search, but not set to a fixed value. Maybe 
you mean UCT_C,
 
UCT=UCT_mean+UCT_C*exploration_term
 
What Petr and Olivier do, I think, is set UCT_C to 0, to disable the 
exploration_term, not the weight of RAVE. 



  Without the exploring term, the UCT is just mean win rate, so there's no 
point in calling it UCT or UCB. Basically, what people have been saying is that 
currently the tree search is based on combination of sequence dependent rate 
(average win rate) and sequence independent/almost independent (rave rate) 
instead of combination of exploitation (win rate) and exploration (UCB term). 
Is this understanding close?

  Fuming 


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Aja

Yamato and I have independently started development of Zen and FudoGo
(resp) with sharing the same idea*, combining MoGo's sequence-like
simulation policy and CrazyStone's larger patterns, though later Zen
uses more complicated policy.

*This is one of the reasons I have started the joint project DeepZen
with Yamato.


  If Zen is not using probabilisic simulation (even with a lot of rules to 
make it less-random, like I do in Erica), I will be very surprised with its 
performance in few simulations. But anyway, I feel sorry to say anything 
about Zen that I am not really sure.


 Aja 


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Erik van der Werf
On Sun, Jan 2, 2011 at 10:31 AM, Olivier Teytaud olivier.teyt...@lri.fr wrote:
 mogo's MC was implemented in many
 strong programs, and influenced the MC of all other
 strong programs. I think there's no exception to this. I was personally of
 little influence on that, but I was here for clearly seeing that
 this is a big contribution.

The playout policy I used in the 2007 version of Steenvreter was
developed independently of the Mogo policy. However, some time after
the olympiad I also implemented Mogo's version for comparison. IIRC
the policy with Yizao's patterns was of comparable quality at equal
numbers of simulations but a bit faster so therefore over-all better.

I think Yizao's policy is the primary reason why Mogo got strong so
fast in 2006.

Erik
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Yamato
Aja wrote:
 Yamato and I have independently started development of Zen and FudoGo
 (resp) with sharing the same idea*, combining MoGo's sequence-like
 simulation policy and CrazyStone's larger patterns, though later Zen
 uses more complicated policy.

 *This is one of the reasons I have started the joint project DeepZen
 with Yamato.

   If Zen is not using probabilisic simulation (even with a lot of rules to 
make it less-random, like I do in Erica), I will be very surprised with its 
performance in few simulations. But anyway, I feel sorry to say anything 
about Zen that I am not really sure.

Zen uses sequence-like AND probabilistic simulation.
Basically it plays around the previous move randomly like MoGo, and these
moves are biased by gamma values like Crazy Stone.

I am also trying to use probabilistic simulation on the whole board, but
it does not yet succeed. The main problem is how to combine the semeai,
seki and useless-move detection with it.

--
Yamato
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Erik van der Werf
On Sun, Jan 2, 2011 at 4:41 PM, Olivier Teytaud olivier.teyt...@lri.fr wrote:
 The playout policy I used in the 2007 version of Steenvreter was
 developed independently of the Mogo policy. However, some time after
 the olympiad I also implemented Mogo's version for comparison. IIRC
 the policy with Yizao's patterns was of comparable quality at equal
 numbers of simulations but a bit faster so therefore over-all better.


 Thanks for this interesting comment. Is your policy described somewhere, and
 was this comparison in 9x9 or 13x13 or 19x19 ?

No, I never published anything on it. The comparison was only for 9x9.

Erik
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Jeff Nowakowski

On 01/02/2011 10:17 AM, Erik van der Werf wrote:


The playout policy I used in the 2007 version of Steenvreter was
developed independently of the Mogo policy.


Did this policy include the idea of sequences (playing near the last 
move), and if so, was that developed independently?


Memories are notoriously faulty. Priority has to be given to published 
papers and established results. It is not enough to say some months 
after MoGo established itself and published the ideas behind the 
implementation that you would have gotten there on your own with 
Steenvreter.


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Erik van der Werf
On Sun, Jan 2, 2011 at 8:52 PM, Jeff Nowakowski j...@dilacero.org wrote:
 On 01/02/2011 10:17 AM, Erik van der Werf wrote:

 The playout policy I used in the 2007 version of Steenvreter was
 developed independently of the Mogo policy.

 Did this policy include the idea of sequences (playing near the last move),
 and if so, was that developed independently?

Yes, the probabilities for playing near the last move were higher.
This was btw a feature I already had in my move predictor back in 2001
(which was published at CG 2002).


 Memories are notoriously faulty.

Memory is not an issue here. I can simply check my code at that time
because I have it archived and under version control (and this is
exactly what I did because I was also wondering when exactly I had
done the comparison with Mogo's policy).


 Priority has to be given to published papers and established results.

Agreed, as long as you don't deny things that were out there long
before someone wrote 'the' paper. I have seen a rather natural
progression from work of Brugman, Kaminski, Bouzy, Helmstetter,
Hamlen, etc. to where we are today. Pinpointing the invention of MCTS
in time or to a single person is simply not that easy.


 It is not enough to say some months after
 MoGo established itself and published the ideas behind the implementation
 that you would have gotten there on your own with Steenvreter.

My reason for posting was not to brag. It was my own choice not to
publish, so I accept it when laymen forget about Steenvreter. However,
I think a guy like Jan Willemson, who's now almost entirely written
out of the history of MCTS, really deserved some credit.

Erik
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Aja

 Hi Erik,


My reason for posting was not to brag. It was my own choice not to
publish, so I accept it when laymen forget about Steenvreter. However,
I think a guy like Jan Willemson, who's now almost entirely written
out of the history of MCTS, really deserved some credit.


  Many people will be very interested if you publish something about 
Steenvreter. Beating Mogo without learning Mogo's paper at that time? That's 
incredible. :)


 Aja 


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Erik van der Werf
Hi Aja,

On Mon, Jan 3, 2011 at 1:28 AM, Aja ajahu...@gmail.com wrote:
  Many people will be very interested if you publish something about
 Steenvreter. Beating Mogo without learning Mogo's paper at that time? That's
 incredible. :)

Well, that's not exactly true either. I did of course read some of
their reports before the 2007 olympiad and some of it even directly
contributed. One example is UCB-tuned. Another is how they
parallellized the search. Also indirectly, discussions with various
people and ideas that came by on this list must have influenced me
somehow. The Mogo team definitely did some great work.

The playouts just happened to be something for which I already had my
own code. Perhaps it was more important that I had a bit different
philosophy on what the playouts were supposed to do than most other
people at that time. I wanted my playouts to evaluate positions well
(and not necessarily play strong moves). So that's what I optimized
them for, focusing especially on errors in human final positions,
which was easy because I already had a large collection of scored 9x9
games from my time in Maastricht. As a consequence Steenvreter got
most of the nakade and seki issues right long before others figured
those things out...

Best,
Erik
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Jeff Nowakowski

On 01/02/2011 05:24 PM, Erik van der Werf wrote:


Agreed, as long as you don't deny things that were out there long
before someone wrote 'the' paper. I have seen a rather natural
progression from work of Brugman, Kaminski, Bouzy, Helmstetter,
Hamlen, etc. to where we are today. Pinpointing the invention of MCTS
in time or to a single person is simply not that easy.


I agree completely, and stated as such in my first message that touched 
off this long thread (and repeated many times since). I never credited 
MoGo with inventing MCTS or UCT.


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Yamato
Aja wrote:
 Zen uses sequence-like AND probabilistic simulation.
 Basically it plays around the previous move randomly like MoGo, and these
 moves are biased by gamma values like Crazy Stone.

 I am also trying to use probabilistic simulation on the whole board, but
 it does not yet succeed. The main problem is how to combine the semeai,
 seki and useless-move detection with it.

   Very surprised that Zen is using partly sequence-like simulation like 
Mogo. Erica uses probabilistic simulation completely. I am not sure if my 
implementation is good enough to be called succeed , but I am sure that 
you will succeed.

I suppose that probabilistic simulation on the whole board is not suitable
for some sort of heuristics, like seki, long semeai or ladder-safe.
Does Erica handle those in the playouts correctly?

--
Yamato
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Aja

Hi Erik,

From your words, I think the main reason that Steenvreter defeated Mogo and 
Crazy Stone is that Steenvreter handled life and death, seki and nakade much 
better than them. Still, many people will be interested if you publish 
something about I had a bit different philosophy on what the playouts were 
supposed to do than most other people at that time. :)


Aja

- Original Message - 
From: Erik van der Werf erikvanderw...@gmail.com

To: Aja ajahu...@gmail.com; computer-go@dvandva.org
Sent: Monday, January 03, 2011 9:16 AM
Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ?


Hi Aja,

On Mon, Jan 3, 2011 at 1:28 AM, Aja ajahu...@gmail.com wrote:

Many people will be very interested if you publish something about
Steenvreter. Beating Mogo without learning Mogo's paper at that time? 
That's

incredible. :)


Well, that's not exactly true either. I did of course read some of
their reports before the 2007 olympiad and some of it even directly
contributed. One example is UCB-tuned. Another is how they
parallellized the search. Also indirectly, discussions with various
people and ideas that came by on this list must have influenced me
somehow. The Mogo team definitely did some great work.

The playouts just happened to be something for which I already had my
own code. Perhaps it was more important that I had a bit different
philosophy on what the playouts were supposed to do than most other
people at that time. I wanted my playouts to evaluate positions well
(and not necessarily play strong moves). So that's what I optimized
them for, focusing especially on errors in human final positions,
which was easy because I already had a large collection of scored 9x9
games from my time in Maastricht. As a consequence Steenvreter got
most of the nakade and seki issues right long before others figured
those things out...

Best,
Erik 


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-02 Thread Yamato
Aja wrote:
 I suppose that probabilistic simulation on the whole board is not suitable
 for some sort of heuristics, like seki, long semeai or ladder-safe.
 Does Erica handle those in the playouts correctly?

It's not true. Erica handle seki, semeai and ladder in the playouts combined 
with probabilistic simulation. Everything Mogo-type, fixed-sequence 
simulation can do, should also be able to be done in probabilistic 
simulation. Probabilistic simulation has more flexibility, as Olivier said.

That means, you track seki or semeai status incrementally?
Or, do you use a resampling algorithm that avoids to play seki-dame or
something useless?

--
Yamato
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2011-01-01 Thread Aja
Hi Fuming,

Most of the current strong programs are using UCT combined with RAVE (a kind of 
AMAF). The formula is like this (there are many variants),

C*RAVE+(1-C)*UCT

C is the weight of RAVE. As far as I know, there are at least two useful 
formula to compute C:
1. The first formula was proposed in the famous paper Combining Online and 
Offline Knowledge in UCT of Sylvain G. and David S (please refer to Section 
6).(http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf).

2. The second (newer) was posted here in the past. Formal reference will be 
David Silver's PhD thesis. This new formula, according to my testing, is 70 elo 
stronger than the first one.

RAVE is really a big invention. It's a big contribution of Mogo. We must thank 
Sylvain and David for bringing such powerful method to us. :)

Aja

  - Original Message - 
  From: Fuming Wang 
  To: computer-go@dvandva.org 
  Sent: Saturday, January 01, 2011 10:16 PM
  Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ?


  So, the current strong programs are more like AMAF instead of UCT, right?

  Fuming


  On Sat, Jan 1, 2011 at 11:32 AM, David Fotland fotl...@smart-games.com 
wrote:

I still have a UCB term, but it's probably because I depend more on Many
Face's move generator.  I have a rave term, but it's contribution is small.
It seems that if the RAVE term is large, then Rave creates enough
exploration by itself.


David

 -Original Message-
 From: computer-go-boun...@dvandva.org [mailto:computer-go-

 boun...@dvandva.org] On Behalf Of Petr Baudis
 Sent: Friday, December 31, 2010 6:27 PM
 To: computer-go@dvandva.org

 Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ?

   Hi!

 On Fri, Dec 31, 2010 at 07:02:35PM +0800, Fuming Wang wrote:
  Now I know Remi is the first to utilize MCTS. Guess I need to read
 papers
  more carefully. I do have a question though. I thought UCT is the
 foundation
  of the current strong programs, I know that a RAVE term is added to the
  original UCB term, i.e. sqrt(t_total/t_i), but the UCB term is still
 there
  right? Could you eleborate a bit on why do you say UCT is not good for
 Go?
  This is quite contradictory to a lot of material on the internet
 regarding
  the lastest bread of go programs.

   Most likely not all (e.g. it seems not ManyFaces?), but at least many
 programs use exploration coefficients that are either zero or negligibly
 small.

   In Pachi, I'm using 0 as the exploration coefficient in the end, it
 seems to work the best. But this probably also depends on the fact that
 I have slight forceful randomization of playouts. 0.02 can work well on
 9x9 too, but it also depends on the priors, etc.

   Overally, it is a question of the overall tuning of the program. But
 right now, reasonably strong play with only RAVE and no UCB1 is
 certainly possible.

 --
   Petr Pasky Baudis
 Computer science education cannot make an expert programmer any more
 than studying brushes and pigment can make an expert painter. --esr
 ___
 Computer-go mailing list
 Computer-go@dvandva.org
 http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go





--


  ___
  Computer-go mailing list
  Computer-go@dvandva.org
  http://dvandva.org/cgi-bin/mailman/listinfo/computer-go___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2010-12-31 Thread Olivier Teytaud
Dear all,
  the original MCTS paper is by Rémi Coulom (to the best of my knowledge at
least...). It's clear for us that we did not invent MCTS
and always referenced Remi's paper.

*1) For UCB-like formula:*
- On the theoretical side, the consistency proof of MCTS without the
UCT-like exploration also comes from mogo-people (Berthier et al); instead
of saying that mogo's contribution is the introduction of UCT (which is good
for other games but not for Go), I would
have said that mogo's contribution is the analysis of MCTS without UCT (even
if MCTS without UCT existed before mogo).
- I'd like to point out that UCB-formulas are, I think, good for games with
random part (e.g. with random transition, or with hidden information which
leads to randomized strategies). But not for Go :-)

*2) For works on the Monte-Carlo part:*
For MoGo's contributions on the Monte-Carlo part, in particular with Yizao's
patterns (with other people as well). There
was a significant difference with previous attempts of designing good
Monte-Carlo parts in the sense that a good Monte-Carlo part
is not a MC-part which plays well as a standalone player, but a MC-part
which plays well with a MCTS on top of it; this is unfortunately not very
convenient as a criterion for designing a MC part... I think the main idea
was the idea of balancing - the situation should not be better for one of
the two players after one move by each player. This was claimed already in
Sylvain's thesis and (I think) earlier than that.

*3) Other mogo's contributions (I might forget many things...)*
are around the fillboard option (which has a great impact for us on MoGo in
19x19), the nakade (maybe there were other simultaneous published methods
for that),
and the RAVE part (Brugmann, Gelly, Silver - Aja said that RAVE was invented
by David and I have no idea on that, but I'm sure Sylvain contributed a lot
on this and Brugmann did something ),
the parallelization (Tristan Cazenave and others have published similar
ideas; the Bourki et al paper has shown clear limitations in terms of
scalability and counter-examples), the automatic building of
patterns by direct policy search (J.-B. Hoock's papers) which can be used
far from Go,
 the simultaneous use of
patterns designed by supervised learning, patterns designed by policy
search,
rave values, expert knowledge with a dirty complicated formula :-)

MoGo was also, I think, the first use of never-ending learning for designing
automatically an opening book by MCTS (this was
moderately good at first because we did not want to use expert knowledge at
all, whereas human expertise was really necessary
for guiding the search...), after months of work on a grid this provides
very good results for 9x9 Go.

The parallelization is only efficient for moderate time settings - otherwise
we have the scalability plateau. For games with very expensive transitions
it might be different...

Best regards,
Olivier
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2010-12-31 Thread Rémi Coulom
Hi,

I'd like to advise against using the exact algorithm I described in my 2006 
paper. I compared it to UCT at that time, and UCT performed better. I am sorry 
I don't have a reference to my data any more. I posted the results to the 
mailing list. It used to be archived at that link:
http://computer-go.org/pipermail/computer-go/2006-July/005737.html
I hope somebody kept an archive an can forward it to the list. The title of 
that message was Experiments with UCT. It is too bad that the archive of that 
time is gone. Is it available anywhere? The google archive starts later.

The biggest difference between the general MCTS idea I proposed and the 
UCT-like algorithms everybody is using today is that the backup operator was 
not the mean of playouts. I tried to consider selectivity and backup 
independently. I still believe it is a powerful idea. My feeling is that a good 
backup operator should be able to make a good decision independently of 
selectivity. That is to say, even if all the moves are explored uniformly, the 
backup operator should converge to min/max. Backing up the mean forces current 
programs to be extremely selective so that they rapidly converges to min/max.

The only part of the tree where it is not important to converge rapidly to 
min/max is the root. I am aware of some experiments that indicate that being 
less selective at the root may improve strength. The efficiency of root 
parallelization may also be an indication that current algorithms are too 
selective at the root. So my vague intuition is that it may be worth trying to 
find a good backup operator that works with less selective search.

Anyway, tweaking the MCTS formula you are using will never make your program 
very strong. Clever playouts are immensely more important, especially for 19x19.

Rémi

On 31 déc. 2010, at 07:02, Fuming Wang wrote:

 
 
 -- Forwarded message --
 From: Fuming Wang fuming...@gmail.com
 Date: Fri, Dec 31, 2010 at 1:50 PM
 Subject: Re: [Computer-go] News on Tromp-Cook ?
 To: Aja ajahu...@gmail.com
 
 
 Hi Aja,
 
 Remi and S. Gelly's paper both come out in 2006,and I just checked that they 
 did not reference each other. I just read Remi's paper again, and realized 
 that CrazyStone's tree search approach is actually different from the popular 
 UCT method. Similar to you, I haven't been able to get good results from the 
 popular UCT method, so I might try CrazyStone's method for a change. In 
 Remi's paper, CrazyStone is only having around 30% winning rate against Gnu 
 Go 3.6, and now Erica is winning world competitions,this actually proves that 
 high quality MC simulation (realized for the first time in MoGo) is more 
 important than tree search algorithms.
 
 Best regards,
 Fuming
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2010-12-31 Thread Robert Jasiek

On 31.12.2010 13:31, Rémi Coulom wrote:
 being less selective at the root may improve strength

So what about P; P ca. 3 ply, at each of these plies, a) firstly filter 
obviously bad moves, b) secondly consider each still available move's 
children by MC / UCT? Or, more generally, dynamically increase P subject 
to hardware power.


--
robert jasiek
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2010-12-31 Thread Aja
Hi Olivier,

Thanks for your clarification. Apologize for my wrong description about the 
invention of RAVE. I show the mail next again,

Sorry, I might be wrong at RAVE. Maybe it should be: Sylvain proprosed the 
idea of RAVE and David Silver proposed a new formula for RAVE.

Aja

  - Original Message - 
  From: Olivier Teytaud 
  To: Aja ; computer-go 
  Sent: Friday, December 31, 2010 5:51 PM
  Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ?




  Dear all,
the original MCTS paper is by Rémi Coulom (to the best of my knowledge at 
least...). It's clear for us that we did not invent MCTS
  and always referenced Remi's paper.

  1) For UCB-like formula:
  - On the theoretical side, the consistency proof of MCTS without the UCT-like 
exploration also comes from mogo-people (Berthier et al); instead of saying 
that mogo's contribution is the introduction of UCT (which is good for other 
games but not for Go), I would
  have said that mogo's contribution is the analysis of MCTS without UCT (even 
if MCTS without UCT existed before mogo).
  - I'd like to point out that UCB-formulas are, I think, good for games with 
random part (e.g. with random transition, or with hidden information which 
leads to randomized strategies). But not for Go :-)

  2) For works on the Monte-Carlo part:
  For MoGo's contributions on the Monte-Carlo part, in particular with Yizao's 
patterns (with other people as well). There
  was a significant difference with previous attempts of designing good 
Monte-Carlo parts in the sense that a good Monte-Carlo part
  is not a MC-part which plays well as a standalone player, but a MC-part which 
plays well with a MCTS on top of it; this is unfortunately not very convenient 
as a criterion for designing a MC part... I think the main idea was the idea of 
balancing - the situation should not be better for one of the two players after 
one move by each player. This was claimed already in Sylvain's thesis and (I 
think) earlier than that.

  3) Other mogo's contributions (I might forget many things...)
  are around the fillboard option (which has a great impact for us on MoGo in 
19x19), the nakade (maybe there were other simultaneous published methods for 
that),
  and the RAVE part (Brugmann, Gelly, Silver - Aja said that RAVE was invented 
by David and I have no idea on that, but I'm sure Sylvain contributed a lot on 
this and Brugmann did something ),
  the parallelization (Tristan Cazenave and others have published similar 
ideas; the Bourki et al paper has shown clear limitations in terms of 
scalability and counter-examples), the automatic building of
  patterns by direct policy search (J.-B. Hoock's papers) which can be used far 
from Go,
   the simultaneous use of
  patterns designed by supervised learning, patterns designed by policy search, 
  rave values, expert knowledge with a dirty complicated formula :-)

  MoGo was also, I think, the first use of never-ending learning for designing 
automatically an opening book by MCTS (this was
  moderately good at first because we did not want to use expert knowledge at 
all, whereas human expertise was really necessary
  for guiding the search...), after months of work on a grid this provides very 
good results for 9x9 Go.

  The parallelization is only efficient for moderate time settings - otherwise 
we have the scalability plateau. For games with very expensive transitions it 
might be different...

  Best regards,
  Olivier
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2010-12-31 Thread Jeff Nowakowski

On 12/31/2010 07:31 AM, Rémi Coulom wrote:


I'd like to advise against using the exact algorithm I described in
my 2006 paper. I compared it to UCT at that time, and UCT performed
better. I am sorry I don't have a reference to my data any more. I
posted the results to the mailing list. It used to be archived at
that link:
http://computer-go.org/pipermail/computer-go/2006-July/005737.html I
hope somebody kept an archive an can forward it to the list. The
title of that message was Experiments with UCT. It is too bad that
the archive of that time is gone. Is it available anywhere? The
google archive starts later.


Here: http://thread.gmane.org/gmane.games.devel.go/6366

Archives and current messages for this list can be found at:

http://blog.gmane.org/gmane.games.devel.go

I don't like the web interface, however. It's very hard to browse
the threads for a particular month without a lot of clicking. The search
is OK, though.

What I do is use the traditional NTTP news interface to gmane.
Thunderbird (my newsreader) has an option to download all the messages, 
which is nice for browsing messages, both old and new.


___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Fwd: News on Tromp-Cook ?

2010-12-31 Thread Fuming Wang
Hi Remi,

Thanks for the suggestions. Sorry for inaccuracies in my previous
statements. Now I have read your paper more carefully, I find in the
appendex many discussions related to improvements on playout move
selections. On another note, I find that formula you gave on variance
calculation very interesting. I am a little confused on the meaning of
symble sigma in the formula, is it the number of wins of S trials? If that's
the case, then the meaning of sigma_sup(2) can not be easily understood.

Best regards,

Fuming

On Fri, Dec 31, 2010 at 8:31 PM, Rémi Coulom remi.cou...@free.fr wrote:

 Hi,

 I'd like to advise against using the exact algorithm I described in my 2006
 paper. I compared it to UCT at that time, and UCT performed better. I am
 sorry I don't have a reference to my data any more. I posted the results to
 the mailing list. It used to be archived at that link:
 http://computer-go.org/pipermail/computer-go/2006-July/005737.html
 I hope somebody kept an archive an can forward it to the list. The title of
 that message was Experiments with UCT. It is too bad that the archive of
 that time is gone. Is it available anywhere? The google archive starts
 later.

 The biggest difference between the general MCTS idea I proposed and the
 UCT-like algorithms everybody is using today is that the backup operator was
 not the mean of playouts. I tried to consider selectivity and backup
 independently. I still believe it is a powerful idea. My feeling is that a
 good backup operator should be able to make a good decision independently of
 selectivity. That is to say, even if all the moves are explored uniformly,
 the backup operator should converge to min/max. Backing up the mean forces
 current programs to be extremely selective so that they rapidly converges to
 min/max.

 The only part of the tree where it is not important to converge rapidly to
 min/max is the root. I am aware of some experiments that indicate that being
 less selective at the root may improve strength. The efficiency of root
 parallelization may also be an indication that current algorithms are too
 selective at the root. So my vague intuition is that it may be worth trying
 to find a good backup operator that works with less selective search.

 Anyway, tweaking the MCTS formula you are using will never make your
 program very strong. Clever playouts are immensely more important,
 especially for 19x19.

 Rémi

 On 31 déc. 2010, at 07:02, Fuming Wang wrote:

 
 
  -- Forwarded message --
  From: Fuming Wang fuming...@gmail.com
  Date: Fri, Dec 31, 2010 at 1:50 PM
  Subject: Re: [Computer-go] News on Tromp-Cook ?
  To: Aja ajahu...@gmail.com
 
 
  Hi Aja,
 
  Remi and S. Gelly's paper both come out in 2006,and I just checked that
 they did not reference each other. I just read Remi's paper again, and
 realized that CrazyStone's tree search approach is actually different from
 the popular UCT method. Similar to you, I haven't been able to get good
 results from the popular UCT method, so I might try CrazyStone's method for
 a change. In Remi's paper, CrazyStone is only having around 30% winning rate
 against Gnu Go 3.6, and now Erica is winning world competitions,this
 actually proves that high quality MC simulation (realized for the first time
 in MoGo) is more important than tree search algorithms.
 
  Best regards,
  Fuming
 ___
 Computer-go mailing list
 Computer-go@dvandva.org
 http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] Fwd: News on Tromp-Cook ?

2010-12-30 Thread Aja
Hi Fuming,

Remi's CG2006 paper is published/released half an year earlier. CG2006 was on 
2006/5/29-31 at Turin, Italy. Mogo's paper 
(http://hal.inria.fr/docs/00/12/15/16/PDF/RR-6062.pdf) was released on 2006/11. 
That's why Mogo's paper cited Remi's paper (please see reference [9] in this 
report).

As for MCTS, except the UCB formula, can you point out what is the difference 
between Remi's MCTS and Mogo's UCT? In my understand, UCT is exactly a 
selection strategy of MCTS. This is how Chaslot classified in his paper 
Progressive Strategies for Monte-Carlo Tree Search.

Aja


- Original Message - 
  From: Fuming Wang 
  To: computer-go@dvandva.org 
  Sent: Friday, December 31, 2010 2:02 PM
  Subject: [Computer-go] Fwd: News on Tromp-Cook ?





  -- Forwarded message --
  From: Fuming Wang fuming...@gmail.com
  Date: Fri, Dec 31, 2010 at 1:50 PM
  Subject: Re: [Computer-go] News on Tromp-Cook ?
  To: Aja ajahu...@gmail.com


  Hi Aja,

  Remi and S. Gelly's paper both come out in 2006,and I just checked that they 
did not reference each other. I just read Remi's paper again, and realized that 
CrazyStone's tree search approach is actually different from the popular UCT 
method. Similar to you, I haven't been able to get good results from the 
popular UCT method, so I might try CrazyStone's method for a change. In Remi's 
paper, CrazyStone is only having around 30% winning rate against Gnu Go 3.6, 
and now Erica is winning world competitions,this actually proves that high 
quality MC simulation (realized for the first time in MoGo) is more important 
than tree search algorithms.

  Best regards,
  Fuming



  On Fri, Dec 31, 2010 at 12:41 PM, Aja ajahu...@gmail.com wrote:

Hi Fuming,

The idea of improving the quality of simulation is more earlier, than 
Mogo’s paper, in the Appendix A of Remi Coulom’s CG2006 paper “Efficient 
Selectivity and Backup Operators in Monte-Carlo Tree 
Search”(http://remi.coulom.free.fr/CG2006/CG2006.pdf):

The choice of a more clever probability distribution can improve the 
quality of the Monte-Carlo estimation

I am not sure if Remi was the first one proposing this concept in Computer 
Go field, but Mogo definitely was not.

I was in Amstertam attending Computer Olympiad 2007, in the team of Chinese 
chess program “Deep Elephant”. I played with Crazy Stone, Mogo and was very 
surprised to see they beat me. Afterwards, Mogo’s paper is so easy to 
understand/implement for me that trigger me to work on Computer Go. Indeed, 
Mogo has huge contributions, especially in the popularization of MCTS. I don’t 
mean to weaken or deny it, but just want to point out Crazy Stone’s great 
contributions. In Erica, I use CrazyStone-like simulations successfully. 
Mogo-type simulation almost does not help Erica at all. 

If we want to numerate the strongest programs, we cannot forget Fuego(2010 
UEC Cup winner) and MyGoFriend(Computer Olympiad 2010, 9x9 winner). For 
academic progress, we cannot forget Crazy Stone. For practical development 
usage, we cannot forget GnuGo and GoGui released by Fuego team. There were 
really too many contributors in the past.

Happy New Years to all.

Aja

From: Fuming Wang 
Sent: Friday, December 31, 2010 10:16 AM
To: Aja ; computer-go@dvandva.org 
Subject: Re: [Computer-go] News on Tromp-Cook ?

This is certainly a good time to sit back and look at what got us here. The 
following key ideas have been mentioned so far: UCB, MCTS, RAVE, Pattern and Go 
knowledge during MC simulation.These ideas are all essential to a strong MC 
based Go program.If we want to pick the most important idea that got us here, I 
would say it is the realization that adding Go Pattern and Go Knowledge to MC 
simulation can significantly improve the quality of board evaluation. This is 
amount to the important sampling concept in MC integration, which is very 
import for Monte Carlo applications in many fields. MC simulation with 
importance sampling give us for the first time a reasonablly accurate 
evaluation function for Go. UCB, MCTS, RAVE are certainly very important, 
however, it is still possible that new approaches that can achieve good results 
with just importantly samples MC simulation. So, I think MoGo is the most 
important break-through.

Happy New Year, everyone!
Fuming




On Fri, Dec 31, 2010 at 9:20 AM, Aja ajahu...@gmail.com wrote:

  Hi Jeff,

  When, do you think, did Mogo started dominating all the KGS computer 
events and CGOS, and also was the first to extend that dominance from 9x9 to 
19x19.?

  In Computer Olympiad 2007, Steenvreter was gold medal on 9x9. At the 
final match of 19x19, it's easily to see that Mogo and Crazy Stone were close 
(finally Mogo 1st and CS 2ed). But, at the end of 2007, Crazy Stone defeated 
Mogo and won the UEC Cup (19x19). Afterwards, Many Faces won 9x9 and 19x19 on 
2008. Zen and Erica won 2009 and 2010, both continuing Crazy