Re: [computer-go] Monte-Carlo Go simulation

2007-02-12 Thread David Doshay

On 9 Feb 2007, at 4:40 AM, Sylvain Gelly wrote:

Alain's point, that knowledge can both help narrow the search to "good"
moves and at the same time steer you away from the best move is
absolutely true in SlugGo's case.

I completely agree with that.
However, can we agree that we want a player that is better as a whole, and not
only better in some particular positions? So perhaps, I think, being
far from the best move while always playing good moves is already
good, no?

Sylvain


Absolutely. I notice that when SlugGo makes moves that a professional
said "look quite playable," a huge mistake is usually about to happen very soon.

Making the best move from the point of view of a really strong player
is also NOT what I want SlugGo to do. SlugGo has no concept of what
the implications are and what the required followup moves will be.

It is clearly better for SlugGo to make many 90% moves in a row than to
have it try to make any 100% moves. There is a much better chance of
it finding the correct follow-up to slightly lower-quality moves.



Cheers,
David

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Monte-Carlo Go simulation

2007-02-11 Thread Sylvain Gelly

Alain's point, that knowledge can both help narrow the search to "good"
moves and at the same time steer you away from the best move is
absolutely true in SlugGo's case.

I completely agree with that.
However, can we agree that we want a player that is better as a whole, and not
only better in some particular positions? So perhaps, I think, being
far from the best move while always playing good moves is already
good, no?

Sylvain


Re: [computer-go] Monte-Carlo Go simulation

2007-02-09 Thread Sylvain Gelly

> I think we have to start defining what the bias. For me the bias is
> the difference between the expected value of the outcomes of playouts
> by the simulation player and the "real minimax value". In this
> definition the uniform random simulation player is VERY biased and
> gnugo much less.
OK, but I used "bias" in the common sense, to mean that the "strong simulator" has
preferences for some moves and doesn't consider them all equally,
or worse, doesn't consider some moves at all.

OK, you are talking about "bias" on the moves; I was talking about bias
in the Monte-Carlo simulation outcomes (the difference between the
expectation of the random variable and the real value you want to
estimate). So you are really talking about the difference from the uniform
distribution on moves. I think what we care about is the MC outcomes.
The particular moves played by the simulation player do not matter.


So it will miss some good points due to
its knowledge, whereas the random player will find the move.

But we don't care about the random player "finding" the move or not.
If the random player plays the good move with probability 1/100, and
also does not find the good answers afterwards, it is not clear how that
changes the expectation of the outcomes.
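As a toy illustration of this outcome-level notion of bias (a sketch on a trivial subtraction game, not Go; the game and all numbers are assumptions chosen only because its minimax value is known in closed form): uniform random playouts can report a win rate far from the true minimax value even though every move gets tried.

```python
import random

def minimax_value(n):
    # Subtraction game: players alternately take 1 or 2 stones;
    # taking the last stone wins. The player to move loses exactly
    # when n is a multiple of 3 (standard analysis of this game).
    return 0 if n % 3 == 0 else 1

def random_playout(n):
    # Uniform random playout; returns 1 if the player to move at the
    # root wins, 0 otherwise.
    player = 0
    while True:
        take = random.choice((1, 2)) if n >= 2 else 1
        n -= take
        if n == 0:
            return 1 if player == 0 else 0
        player ^= 1

random.seed(0)
trials = 100_000
estimate = sum(random_playout(6) for _ in range(trials)) / trials
# The minimax value of n = 6 is 0 (a lost position), but uniform random
# playouts report a win rate near 0.41: a large outcome-level bias,
# even though the random player "finds" every move with some probability.
print(f"estimate={estimate:.3f}  minimax={minimax_value(6)}")
```

In the sense discussed above, this uniform random simulation player is very biased, despite assigning nonzero probability to every legal move.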


> > Even if it is obviously much stronger than a random player, it would give
> > a wrong result if used as a simulation player.
> Hum, are you sure?
I'm 100% sure of this :-)


May I be 99% sure that you should not be 100% sure of this? ;-)
I think that without empirical evidence, we can't be 100% sure...



> I think that GnuGo with randomisation (and much
> faster of course) would make a very good simulation player (much
> better than any existing simulation player).
Even with randomization, GNU Go considers only a few dozen possible moves,
and makes systematic errors.

You can be epsilon-greedy if you want to avoid systematic errors.

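The epsilon-greedy idea above can be sketched as a wrapper around any knowledge-based move generator (a sketch under stated assumptions: `strong_move` and `legal_moves` are hypothetical stand-ins, not a real engine API):

```python
import random

def epsilon_greedy_policy(strong_move, legal_moves, epsilon=0.1, rng=random):
    """Wrap a deterministic 'strong' move generator so that with
    probability epsilon a uniformly random legal move is played instead.
    This removes systematic blind spots: every legal move keeps a
    nonzero probability at every playout step."""
    def policy(position):
        moves = legal_moves(position)
        if rng.random() < epsilon:
            return rng.choice(moves)   # exploration: any legal move
        return strong_move(position)   # otherwise: the engine's choice
    return policy

# Purely illustrative stand-ins (hypothetical, not a real engine):
rng = random.Random(0)
legal = lambda pos: [0, 1, 2, 3, 4]
strong = lambda pos: 3                 # a "strong" policy fixated on move 3
play = epsilon_greedy_policy(strong, legal, epsilon=0.2, rng=rng)
picks = [play(None) for _ in range(10_000)]
```

With epsilon = 0.2 and five legal moves, the engine's favourite is picked about 84% of the time (80% greedy plus 4% by chance), while each other move keeps a 4% chance, so no move is ever systematically excluded.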

Some time ago Rémi Coulom asked for "positions
illustrating computer stupidity" (2006-11-22)
http://computer-go.org/pipermail/computer-go/2006-November/007107.html
and GNU Go provided some nice examples where its (wrong/misunderstood)
knowledge induces a failure in play.

I bet we can find many more positions where uniform random would give
the wrong answer with high probability, can't we?

Furthermore, we are not talking about having a perfect player. There
will always be particular positions where the computer fails, just as I am
sure we can find positions where humans fail.


> I understand all these counter examples, I just think that it is more
> complicated than that.
I fully agree.


Good, then we only have to find out how to do better ;-).

Sylvain


Re: [computer-go] Monte-Carlo Go simulation

2007-02-09 Thread alain Baeckeroot
On Thursday 8 February 2007 at 22:09, Sylvain Gelly wrote:
> > It seems I was ambiguous: I was speaking of the simulation player too.
> > What I meant is a random simulation player is not biased, whereas a "better"
> > simulation player is biased by its knowledge, and thus can give a wrong
> > evaluation of a position.
> I think we have to start by defining what the bias is. For me the bias is
> the difference between the expected value of the outcomes of playouts
> by the simulation player and the "real minimax value". In this
> definition the uniform random simulation player is VERY biased and
> GnuGo much less so.
OK, but I used "bias" in the common sense, to mean that the "strong simulator" has
preferences for some moves and doesn't consider them all equally,
or worse, doesn't consider some moves at all. So it will miss some good points due to
its knowledge, whereas the random player will find the move.

> 
> > A trivial example is GNU Go: its analysis is "sometimes" wrong.
> Of course; if not, computer go would be solved :-).
> 
> > Even if it is obviously much stronger than a random player, it would give
> > a wrong result if used as a simulation player.
> Hum, are you sure?
I'm 100% sure of this :-)
 
> I think that GnuGo with randomisation, (and much 
> faster of course) would make a very good simulation player (much
> better than any existing simulation player).
Even with randomization, GNU Go considers only a few dozen possible moves,
and makes systematic errors. Some time ago Rémi Coulom asked for "positions
illustrating computer stupidity" (2006-11-22)
http://computer-go.org/pipermail/computer-go/2006-November/007107.html
and GNU Go provided some nice examples where its (wrong/misunderstood) knowledge
induces a failure in play. One very impressive example was GNU Go 3.6 not invading
where it is obviously possible to invade (Steven Clark 2006-11-27)
http://computer-go.org/pipermail/computer-go/2006-November/007184.html

> But a weaker player than GnuGo can make an even better simulation player.
Yes.
> 
> > David Doshay experiments with SlugGo showed that
> > searching very deep/wide does not improve a lot the strength of the engine,
> > which is bound by the underlying weaknesses of GNU Go.
> Yes, this is a similar non-trivial result. I think there are more
> existing experimental and theoretical analyses of this, though.
> Perhaps such an analysis already exists for MC also; it is just that I
> don't know of it.
> 
> > Or maybe I just understood nothing of what you explained ;)
> It was not really "explanations", just thoughts. I don't have the
> solution; I just think that it is an interesting question, and that it
> should be discussed. Maybe new ideas could come from a good
> explanation of this phenomenon.
> 
> I understand all these counter examples, I just think that it is more
> complicated than that.
> 
> Sylvain

I fully agree.
Alain


Re: [computer-go] Monte-Carlo Go simulation

2007-02-08 Thread David Doshay
I think that the bias Alain meant is the choice of moves that controls the
branching factor. If I understand correctly, this can happen differently
in two places in MoGo: once in the branching below a node in the UCT
tree, and either the same way or differently in the random playouts.

In some ways this is like SlugGo, where branching at the first level
(our move) may or may not be determined in the same way as at
the next (their guessed reply). If SlugGo is set for multiple levels of
branching in the lookahead we do the same thing but from the other
perspective. But for deeper linear lookahead things are different, just
like in your random playouts.

Alain's point, that knowledge can both help narrow the search to "good"
moves and at the same time steer you away from the best move is
absolutely true in SlugGo's case. This is the primary reason we have
always wanted to have multiple Go engines making move suggestions,
not just multiple instantiations of the same engine like we have now.

But we could get up and running faster with one engine, so that is
where we are now. Hopefully not much longer ...

Cheers,
David



On 8 Feb 2007, at 2:09 PM, Sylvain Gelly wrote:

It seems I was ambiguous: I was speaking of the simulation player too.
What I meant is a random simulation player is not biased, whereas a "better"
simulation player is biased by its knowledge, and thus can give a wrong
evaluation of a position.

I think we have to start by defining what the bias is. For me the bias is
the difference between the expected value of the outcomes of playouts
by the simulation player and the "real minimax value". In this
definition the uniform random simulation player is VERY biased and
GnuGo much less so.


A trivial example is GNU Go: its analysis is "sometimes" wrong.

Of course; if not, computer go would be solved :-).

Even if it is obviously much stronger than a random player, it would give
a wrong result if used as a simulation player.

Hum, are you sure? I think that GnuGo with randomisation (and much
faster of course) would make a very good simulation player (much
better than any existing simulation player). But a weaker player than
GnuGo can make an even better simulation player.


David Doshay's experiments with SlugGo showed that
searching very deep/wide does not improve the strength of the engine a lot,
which is bound by the underlying weaknesses of GNU Go.

Yes, this is a similar non-trivial result. I think there are more
existing experimental and theoretical analyses of this, though.
Perhaps such an analysis already exists for MC also; it is just that I
don't know of it.


Or maybe I just understood nothing of what you explained ;)

It was not really "explanations", just thoughts. I don't have the
solution; I just think that it is an interesting question, and that it
should be discussed. Maybe new ideas could come from a good
explanation of this phenomenon.

I understand all these counter examples, I just think that it is more
complicated than that.

Sylvain


Re: [computer-go] Monte-Carlo Go simulation

2007-02-08 Thread Sylvain Gelly

It seems I was ambiguous: I was speaking of the simulation player too.
What I meant is a random simulation player is not biased, whereas a "better"
simulation player is biased by its knowledge, and thus can give a wrong
evaluation of a position.

I think we have to start by defining what the bias is. For me the bias is
the difference between the expected value of the outcomes of playouts
by the simulation player and the "real minimax value". In this
definition the uniform random simulation player is VERY biased and
GnuGo much less so.


A trivial example is GNU Go: its analysis is "sometimes" wrong.

Of course; if not, computer go would be solved :-).


Even if it is obviously much stronger than a random player, it would give
a wrong result if used as a simulation player.

Hum, are you sure? I think that GnuGo with randomisation (and much
faster of course) would make a very good simulation player (much
better than any existing simulation player). But a weaker player than
GnuGo can make an even better simulation player.


David Doshay's experiments with SlugGo showed that
searching very deep/wide does not improve the strength of the engine a lot,
which is bound by the underlying weaknesses of GNU Go.

Yes, this is a similar non-trivial result. I think there are more
existing experimental and theoretical analyses of this, though.
Perhaps such an analysis already exists for MC also; it is just that I
don't know of it.


Or maybe I just understood nothing of what you explained ;)

It was not really "explanations", just thoughts. I don't have the
solution; I just think that it is an interesting question, and that it
should be discussed. Maybe new ideas could come from a good
explanation of this phenomenon.

I understand all these counter examples, I just think that it is more
complicated than that.

Sylvain


Re: [computer-go] Monte-Carlo Go simulation

2007-02-08 Thread alain Baeckeroot
On Thursday 8 February 2007 at 20:12, Sylvain Gelly wrote:
> > One simple explanation could be that a random player shamelessly tries "all"
> > moves (very bad ones but also very nice tesuji), whereas the "stronger" player
> > is restricted by its knowledge and will always miss some kinds of moves.
> 
> Here we are not speaking about the pruning in the tree, but about the
> simulation player. The tree must explore every move, to avoid missing
> important ones. However, we totally don't care whether all possible games
> can be played by the simulation player or not. What we care about is
> the expectation of the wins by self-play.
> If the simulation player sometimes plays meaningful sequences but with
> a very small probability, then it has very little influence on the
> expectation.
> 

It seems I was ambiguous: I was speaking of the simulation player too.
What I meant is a random simulation player is not biased, whereas a "better"
simulation player is biased by its knowledge, and thus can give a wrong
evaluation of a position.

A trivial example is GNU Go: its analysis is "sometimes" wrong. Even if it
is obviously much stronger than a random player, it would give a wrong result if
used as a simulation player. David Doshay's experiments with SlugGo showed that
searching very deep/wide does not improve the strength of the engine a lot,
which is bound by the underlying weaknesses of GNU Go.

Or maybe I just understood nothing of what you explained ;)
Alain


Re: [computer-go] Monte-Carlo Go simulation

2007-02-08 Thread Sylvain Gelly

One simple explanation could be that a random player shamelessly tries "all"
moves (very bad ones but also very nice tesuji), whereas the "stronger" player
is restricted by its knowledge and will always miss some kinds of moves.


Here we are not speaking about the pruning in the tree, but about the
simulation player. The tree must explore every move, to avoid missing
important ones. However, we totally don't care whether all possible games
can be played by the simulation player or not. What we care about is
the expectation of the wins by self-play.
If the simulation player sometimes plays meaningful sequences but with
a very small probability, then it has very little influence on the
expectation.
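This division of labour, where the tree itself guarantees that every move is eventually explored, is what the standard UCB1 selection rule in UCT provides. A minimal sketch (this is the generic UCB1 formula, not MoGo's actual code; the node layout and exploration constant are assumptions):

```python
import math

def ucb1_select(children, total_visits, c=1.4):
    """Pick the child index maximizing wins/visits + c*sqrt(ln(N)/visits).

    `children` is a list of (wins, visits) pairs for the moves below a
    node, and `total_visits` (N) is the parent's visit count. An
    unvisited child is always tried first, so the tree explores every
    move even if the playout policy would almost never choose it.
    """
    best, best_score = None, -1.0
    for i, (wins, visits) in enumerate(children):
        if visits == 0:
            return i  # unvisited moves get absolute priority
        score = wins / visits + c * math.sqrt(math.log(total_visits) / visits)
        if score > best_score:
            best, best_score = i, score
    return best
```

The exploration term shrinks as a move accumulates visits, so moves that look bad are revisited only occasionally, while an unvisited move is never skipped, regardless of how biased the simulation player's own move distribution is.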

It seems to me that the explanation may be more complicated.

Sylvain


Re: [computer-go] Monte-Carlo Go simulation

2007-02-08 Thread alain Baeckeroot
On Thursday 8 February 2007 at 17:06, Sylvain Gelly wrote:
> Hello,
> 
> > Is there any known (by theory or tests) function of how much an increase
> > in the strength of the simulation policy increases the strength of the
> > MC/UCT program as a whole?
> 
> I think that is a very interesting question.
> In our work on MoGo we found that there could be a decrease in the
> strength of the MC/UCT program while using a stronger simulation
> policy. That is why in MoGo it is more the "sequence idea" than the
> "strength idea". Our best simulation policy is quite weak compared to
> others we tested.
> But we have further experiments, in work with David Silver from the
> University of Alberta. We found that the relation "strong
> simulation policy" <=> "strong MC program" is wrong at a much larger
> scale. So the "intransitivity" holds even with much, much stronger
> simulation policies.
> 
One simple explanation could be that a random player shamelessly tries "all"
moves (very bad ones but also very nice tesuji), whereas the "stronger" player
is restricted by its knowledge and will always miss some kinds of moves.

Similar things were reported by David Doshay for SlugGo, which is limited by the
underlying GNU Go no matter how deep/wide it searches.

Alain

