
Re: [computer-go] Scoring - step function or sigmoid function?

2009-07-08 Thread Álvaro Begué

You can also test against weaker programs with compensating handicap.
It might not be quite as interesting, but it's easier to test.


2009/7/8 terry mcintyre :
> To properly test any method of playing with a handicap, today's programs
> will need to play against much stronger opponents. Self-play, or play
> against other roughly-equal programs, won't test the ability to eke out a
> win against a professional go player.
>
> Terry McIntyre 
>
> “We hang the petty thieves and appoint the great ones to public office.” --
> Aesop
>
>
>
> ___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>

Re: [computer-go] Scoring - step function or sigmoid function?

2009-07-08 Thread terry mcintyre

To properly test any method of playing with a handicap, today's programs will 
need to play against much stronger opponents. Self-play, or play against other 
roughly-equal programs, won't test the ability to eke out a win against a 
professional go player.

Terry McIntyre 



“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop





Re: [computer-go] Scoring - step function or sigmoid function?

2009-07-08 Thread dhillismail


Since this topic has resurfaced, I'll mention again the alternative strategy of 
using unbalanced playout rules to compensate for high handicaps. As Don pointed 
out, the existence of a high handicap *should* indicate that black is more 
likely to make mistakes. This is simple to model, assuming heavy playouts, by 
adding a bit more randomness to black moves (only outside the tree). 


Personally, I'm inclined to believe that unbalancing the playouts should be 
superior to adjusting the Komi. When black's external playout moves are more 
random, white will be more likely to win unsettled areas, but settled areas 
will tend to stay settled. Inside the tree, we want white searching for 
complicated board positions and black searching for simpler ones. 
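
A rough Python sketch of the unbalanced-playout idea described above. It assumes a heavy playout policy is available as a callback. The names `playout_move`, `heuristic_move` and `EXTRA_RANDOMNESS`, and all the numbers in the table, are hypothetical placeholders and are not taken from any program mentioned in this thread.

```python
import random

# Hypothetical per-handicap probability that Black ignores the heavy-playout
# heuristic and plays a uniformly random legal move instead. The values are
# placeholders; the post says suitable values were found by offline testing.
EXTRA_RANDOMNESS = {0: 0.0, 2: 0.05, 4: 0.10, 9: 0.25}


def playout_move(color, legal_moves, heuristic_move, handicap, rng=random):
    """Pick one move during the playout phase (outside the search tree)."""
    epsilon = EXTRA_RANDOMNESS.get(handicap, 0.0) if color == "black" else 0.0
    if rng.random() < epsilon:
        return rng.choice(legal_moves)   # simulated mistake by the weaker side
    return heuristic_move(legal_moves)   # normal heavy-playout choice
```

Only Black's playout moves are perturbed, and only outside the tree, so move selection inside the tree is left to the normal search.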


In my program, it was easy to find an appropriate adjustment for different 
handicaps through offline testing. From a purely subjective viewpoint, I found 
that the resulting opening moves looked much more reasonable. People who have 
tried dynamically adjusting the komi report similar, subjective success. 
There might not be any practical difference. 
It's not obvious to me what a fair test would be.


I'm convinced that either method is worth doing in the opening for very high 
handicaps. Just looking at some examples is pretty persuasive. I'm more 
doubtful about trying them late in an even game when one side has pulled ahead.

- Dave Hillis


-Original Message-
From: Don Dailey 
To: computer-go 
Sent: Wed, Jul 8, 2009 8:49 am
Subject: Re: [computer-go] Scoring - step function or sigmoid function?






On Mon, Jun 8, 2009 at 7:35 AM, Stefan Kaitschick 
 wrote:





Thinking about why... In a given board position moves can be grouped
into sets: the set of correct moves, the set of 1pt mistakes, 2pt
mistakes, etc. Let's assume each side has roughly the same number of
moves each in each of these groupings.

If black is winning by 0.5pt with perfect play, then mistakes by each
side balance out and we get a winning percentage of just over 50%. If he
is winning by 1.5pt then he has breathing space and can make an extra
mistake. Or in other words, at a certain move he can play any of the
moves in the "correct moves" set, or any of the moves in the "1pt
mistakes" set, and still win. So he wins more of the playouts. Say 55%.
If he is winning by 2.5pts then he can make one 2pt mistake or two 1pt
mistakes (more than the opponent) and still win, so he wins more
playouts, 60% perhaps. And so on.

My conclusion was that the winning percentage is more than just an
estimate of how likely the player is to win. It is in fact a crude
estimator of the final score.

Going back to your original comment, when choosing between move A that
leads to a 0.5pt win, and move B that leads to a 100pt win, you should
be seeing move B has a higher winning percentage.

Darren




Point well taken. Winning positions tend to cluster and critical swing moves are 
rare, statistically speaking.
If the position is more or less evenly balanced, the step function might 
already be very close to optimal because of this.
But I would like to bring up a well-known mc quirk: in handicap positions, or 
after one side has scored a big success in an even game,
bots play badly with both sides until the position becomes closer again. The 
problem here is that every move is a win (or every move is a loss).
On 9*9, it's possible to beat a bot while giving it 2 stones, even when it's a close 
contest on even with komi. All it takes is a single bot misread at the moment 
the position becomes close (which it will, because the bot will be "lazy" until 
that point).


I would say it's foolish to purposely give the bot 2 stones in order to hope 
for a misread unless you are expert on that particular behaviour and can 
predict just where and why it will go wrong.


So it would be desirable for the bot to make keeping the score advantage large 
an auxiliary goal.
This has been tried, of course, but without much success so far.
And it seems that the main reason is that tinkering with the scoring function 
to achieve this tends to worsen play in competitive situations.


This is easy to understand - it's because maximizing your winning chances is a 
better strategy than maximizing how many points you take. One is the actual 
goal of the game (to win) and the other is a different goal which is not as 
highly correlated as we like to think it is.


I have an alternative suggestion: In handicap games, introduce a virtual komi, 
that gets reduced to 0 as the game progresses.
This would work for the bot on both sides: If the bot has b it will make less 
lazy plays, if it has w, it will be less maniacal.
For example, in a 4 stone 19*19 game, if the real starting advantage is about 
45 points, the bot could introduce an internal komi of about 30-35.
The bot should be optimistic with b and pessimistic with w, but not to the 
point th

Re: [computer-go] Scoring - step function or sigmoid function?

2009-07-08 Thread Don Dailey

On Mon, Jun 8, 2009 at 7:35 AM, Stefan Kaitschick <
stefan.kaitsch...@hamburg.de> wrote:

>
>
>> Thinking about why... In a given board position moves can be grouped
>> into sets: the set of correct moves, the set of 1pt mistakes, 2pt
>> mistakes, etc. Let's assume each side has roughly the same number of
>> moves each in each of these groupings.
>>
>> If black is winning by 0.5pt with perfect play, then mistakes by each
>> side balance out and we get a winning percentage of just over 50%. If he
>> is winning by 1.5pt then he has breathing space and can make an extra
>> mistake. Or in other words, at a certain move he can play any of the
>> moves in the "correct moves" set, or any of the moves in the "1pt
>> mistakes" set, and still win. So he wins more of the playouts. Say 55%.
>> If he is winning by 2.5pts then he can make one 2pt mistake or two 1pt
>> mistakes (more than the opponent) and still win, so he wins more
>> playouts, 60% perhaps. And so on.
>>
>> My conclusion was that the winning percentage is more than just an
>> estimate of how likely the player is to win. It is in fact a crude
>> estimator of the final score.
>>
>> Going back to your original comment, when choosing between move A that
>> leads to a 0.5pt win, and move B that leads to a 100pt win, you should
>> be seeing move B has a higher winning percentage.
>>
>> Darren
>>
>>
> Point well taken. Winning positions tend to cluster and critical swing moves
> are rare, statistically speaking.
> If the position is more or less evenly balanced, the step function might
> already be very close to optimal because of this.
> But I would like to bring up a well-known mc quirk: in handicap positions,
> or after one side has scored a big success in an even game,
> bots play badly with both sides until the position becomes closer again.
> The problem here is that every move is a win (or every move is a loss).
> On 9*9, it's possible to beat a bot while giving it 2 stones, even when it's a
> close contest on even with komi. All it takes is a single bot misread at
> the moment the position becomes close (which it will, because the bot will be
> "lazy" until that point).


I would say it's foolish to purposely give the bot 2 stones in order to hope
for a misread unless you are expert on that particular behaviour and can
predict just where and why it will go wrong.



>
> So it would be desirable for the bot to make keeping the score advantage
> large an auxiliary goal.
> This has been tried, of course, but without much success so far.
> And it seems that the main reason is that tinkering with the scoring
> function to achieve this tends to worsen play in competitive situations.


This is easy to understand - it's because maximizing your winning chances is
a better strategy than maximizing how many points you take. One is the
actual goal of the game (to win) and the other is a different goal which is
not as highly correlated as we like to think it is.


>
> I have an alternative suggestion: In handicap games, introduce a virtual
> komi, that gets reduced to 0 as the game progresses.
> This would work for the bot on both sides: If the bot has b it will make
> less lazy plays, if it has w, it will be less maniacal.
> For example, in a 4 stone 19*19 game, if the real starting advantage is
> about 45 points, the bot could introduce an internal komi of about 30-35.
> The bot should be optimistic with b and pessimistic with w, but not to the
> point that every move evaluates to the same value, and move selection
> becomes a toss-up. Another way to look at this, is that humans that give a
> handicap know that they can't usually catch up in one piece.
> And humans that take a handicap know that they can't give up their
> advantage too quickly.
> Virtual komi encodes this simple knowledge.
> During the course of the game this internal komi would of course have to be
> reduced to 0.
> The proper criteria can only be found by experimentation, but the important
> factors will be how far the game has progressed, and what the win rate is
> for the best move. If the bot becomes pessimistic with b it should lower the
> internal komi more quickly.


In principle this is no different from the usual schemes applied when there
is no handicap. In practice, there is one thing different that could make
it at least worth a look. When you play WITH a handicap it's because your
opponent is weaker than you are. When the opponent has the handicap it's
because YOU are the weaker player. So you can use the fact that you are
playing in a handicap game to tell you something about your opponent.

Now if you are playing against a weaker opponent, your winning chances
actually do increase, so by manipulation of the komi you can represent that
fact. It's certainly wrong to do this with an equal opponent but perhaps
not so bad with a weaker opponent.

My guess is that this still won't work, but at least there is something
different about these kinds of games that could make this worth an

Re: [computer-go] Scoring - step function or sigmoid function?

2009-07-08 Thread Stefan Kaitschick





Thinking about why... In a given board position moves can be grouped
into sets: the set of correct moves, the set of 1pt mistakes, 2pt
mistakes, etc. Let's assume each side has roughly the same number of
moves each in each of these groupings.

If black is winning by 0.5pt with perfect play, then mistakes by each
side balance out and we get a winning percentage of just over 50%. If he
is winning by 1.5pt then he has breathing space and can make an extra
mistake. Or in other words, at a certain move he can play any of the
moves in the "correct moves" set, or any of the moves in the "1pt
mistakes" set, and still win. So he wins more of the playouts. Say 55%.
If he is winning by 2.5pts then he can make one 2pt mistake or two 1pt
mistakes (more than the opponent) and still win, so he wins more
playouts, 60% perhaps. And so on.

My conclusion was that the winning percentage is more than just an
estimate of how likely the player is to win. It is in fact a crude
estimator of the final score.

Going back to your original comment, when choosing between move A that
leads to a 0.5pt win, and move B that leads to a 100pt win, you should
be seeing move B has a higher winning percentage.

Darren



Point well taken. Winning positions tend to cluster and critical swing moves 
are rare, statistically speaking.
If the position is more or less evenly balanced, the step function might 
already be very close to optimal because of this.
But I would like to bring up a well-known mc quirk: in handicap positions, 
or after one side has scored a big success in an even game,
bots play badly with both sides until the position becomes closer again. 
The problem here is that every move is a win (or every move is a loss).
On 9*9, it's possible to beat a bot while giving it 2 stones, even when it's a 
close contest on even with komi. All it takes is a single bot misread at 
the moment the position becomes close (which it will, because the bot will be 
"lazy" until that point).
So it would be desirable for the bot to make keeping the score advantage 
large an auxiliary goal.

This has been tried, of course, but without much success so far.
And it seems that the main reason is that tinkering with the scoring 
function to achieve this tends to worsen play in competitive situations.
I have an alternative suggestion: In handicap games, introduce a virtual 
komi, that gets reduced to 0 as the game progresses.
This would work for the bot on both sides: If the bot has b it will make 
less lazy plays, if it has w, it will be less maniacal.
For example, in a 4 stone 19*19 game, if the real starting advantage is 
about 45 points, the bot could introduce an internal komi of about 30-35.
The bot should be optimistic with b and pessimistic with w, but not to the 
point that every move evaluates to the same value, and move selection 
becomes a toss-up. Another way to look at this, is that humans that give a 
handicap know that they can't usually catch up in one piece.
And humans that take a handicap know that they can't give up their advantage 
too quickly.

Virtual komi encodes this simple knowledge.
During the course of the game this internal komi would of course have to be 
reduced to 0.
The proper criteria can only be found by experimentation, but the important 
factors will be how far the game has progressed, and what the win rate is 
for the best move. If the bot becomes pessimistic with b it should lower the 
internal komi more quickly.


One advantage of this approach is that it doesn't mess up even game play.
A more elaborate scheme would be to make a "komi search" before the real 
search - to find the best ratio of win rate to internal komi before making 
the normal move search with this komi. This could also be useful in even 
play after one side pulled ahead.
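
A rough Python sketch of such a "komi search". The post does not pin down what the "best ratio of win rate to internal komi" is, so this sketch swaps in one simple criterion as an assumption: take the largest candidate komi whose estimated win rate still reaches a target. `choose_virtual_komi`, `estimate_win_rate`, `toy_estimate` and the 0.55 target are all hypothetical.

```python
def choose_virtual_komi(estimate_win_rate, candidates, target=0.55):
    """Return the largest candidate komi whose estimated win rate >= target.

    `estimate_win_rate(komi)` stands in for a short preliminary search that
    reports the best move's win rate under that komi (hypothetical callback).
    Falls back to 0 when no candidate reaches the target.
    """
    for komi in sorted(candidates, reverse=True):
        if estimate_win_rate(komi) >= target:
            return komi
    return 0.0


# Toy stand-in for the preliminary search: win rate falls 1% per komi point.
def toy_estimate(komi):
    return 0.80 - 0.01 * komi


print(choose_virtual_komi(toy_estimate, [0, 10, 20, 30, 35]))
```

The normal move search would then run with the chosen komi. A real preliminary search would be noisy, so the chosen komi would probably need smoothing from move to move.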


Stefan






Re: [computer-go] Scoring - step function or sigmoid function?

2009-07-07 Thread Darren Cook

> MC bots play handicap games poorly.
> ...
> So my suggestion would be to pretend that there is a komi that almost
> compensates the handicap.

The same idea has been suggested before [1]. The counter comments were
mostly "hard to get right" or "tried that briefly and it didn't
work well".

Darren

[1]: I think starting here, and then the dozen or so followup messages.
http://computer-go.org/pipermail/computer-go/2008-August/015859.html


-- 
Darren Cook, Software Researcher/Developer
http://dcook.org/gobet/  (Shodan Go Bet - who will win?)
http://dcook.org/mlsn/ (Multilingual open source semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)

Re: [computer-go] Scoring - step function or sigmoid function?

2009-07-07 Thread Stefan Kaitschick


My conclusion was that the winning percentage is more than just an
estimate of how likely the player is to win. It is in fact a crude
estimator of the final score.

Going back to your original comment, when choosing between move A that
leads to a 0.5pt win, and move B that leads to a 100pt win, you should
be seeing move B has a higher winning percentage.

Darren


Good point. Wins will occur in clusters, so win-rate and score go hand in 
hand.
MC algorithms seem to be treacherous ground when it comes to anticipating 
consequences.

Here's another shot:

MC bots play handicap games poorly.
When they take w they go into maniac mode because they can't find any wins.
When they take b they dither around until the game gets close.
What is entirely missing are the concepts of catching up or of preserving a 
lead.
So my suggestion would be to pretend that there is a komi that almost 
compensates the handicap.

A numerical example: a 4 stone handicap is worth about 45 points.
Start with a "virtual komi" of about 35 points, and decrease that value to 0 
during the course of the game.
This works in both directions. If the bot has w he will be pessimistic, but 
not suicidal.

If he has b he will be optimistic, but not so lazy.
The rate of "giving rope" should depend on the number of moves played and on 
the winning percentage.

If the winning percentage drops early, reduce the virtual komi more sharply.
One advantage of this approach would be that it wouldn't tinker with even 
game strategies.
A more elaborate scheme would be to make several preliminary searches at 
different komi levels.
The goal would not be to find the best move. The search would only  try to 
find the win rate for the komi.
Depending on how far the game progressed, giving a virtual komi will be 
worth some win rate reduction.
(Or taking a virtual komi will increase the winrate, making the bot play 
less maniacally than playing a string of ko threats in an unfavorable position.)
After a decision on the komi is reached, a search is done for this komi to 
find the move to be played.

Towards the end of the game the virtual komi must always be 0.
This strategy might even be useful for even games, when the winrate strongly 
favors one side before the late endgame.
That would revert to the handicap game situation. (Except that a disparity 
of strength is not presumed)
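
A rough Python sketch of the decaying virtual komi for a bot taking Black in a handicap game. The 35-point starting value comes from the 4-stone example above. The linear fade, the 120-move horizon, the 0.45 "pessimism" threshold and the name `virtual_komi` are assumptions for illustration; the post says the real criteria would have to be found by experiment.

```python
def virtual_komi(move_number, best_win_rate, start_komi=35.0,
                 fade_moves=120, panic_win_rate=0.45):
    """Virtual komi (points charged to Black) for a bot taking the handicap."""
    # Linear fade: full komi at move 0, none from `fade_moves` onward.
    remaining = max(0.0, 1.0 - move_number / fade_moves)
    komi = start_komi * remaining
    # If the search already looks pessimistic, give rope faster.
    if best_win_rate < panic_win_rate:
        komi *= best_win_rate / panic_win_rate
    return komi
```

The komi reaches 0 well before the end of the game, so the final result is always judged by the real score.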


Stefan





Re: [computer-go] Scoring - step function or sigmoid function?

2009-06-30 Thread Christian Nentwich


Darren,

this sounds like a good insight, but only if a very large number of 
playouts have been performed. By contrast, the original poster writes:


 But in the opening, where the scoring
leaves are 300 moves away from the root, surely a putative half point
win doesn't translate to a significant advantage, whereas a 100

This I don't buy. If the scoring leaves are 300 moves away, any random 
playout is way too unreliable to take the score into account. You might 
as well generate a score randomly. It could be a 100 point win on the 
first and 100 point loss on the second. In that case, it will be much 
safer to use Fuego's approach of slightly modifying the playout score 
from [0.0,1.0] to [0.0+s,1.0-s] where s depends on the size of the win 
relative to the board size.
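
A rough Python sketch of this kind of score modification, following only the description in the paragraph above. It is not Fuego's code; the exact formula, the name `playout_value` and the 0.02 cap are assumptions.

```python
def playout_value(margin, board_points=81, max_bonus=0.02):
    """Map a playout's final margin (positive = win) to a value in [0, 1].

    A bare step function would return 1.0 for any win and 0.0 for any loss.
    Here a small amount `s` is held back, shrinking as the margin grows, so
    larger wins score slightly higher without overriding win/loss.
    A margin of exactly 0 is treated as a loss (assumes a half-point komi).
    """
    size = min(abs(margin), board_points) / board_points   # 0..1
    s = max_bonus * (1.0 - size)
    return 1.0 - s if margin > 0 else 0.0 + s
```

Because `max_bonus` is small, the narrowest win is still worth far more than the narrowest loss, so the margin only separates moves that the win rate rates about equally.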


It is also worth bearing in mind - again, only if the state space was 
only very superficially searched - that winning by large margins can 
entail taking large risks. Human players do that only when behind and 
otherwise actively seek the safer route.


Christian


On 01/07/2009 04:23, Darren Cook wrote:

It seems to be surprisingly difficult to outperform the step function
  when it comes to mc scoring. I know that many surprises await the mc
adventurer, but completely discarding the final margin of victory
just can't be optimal. ...
an mc program, holding on to a half point victory in the endgame,  is
a thing of beauty and terror. But in the opening, where the scoring
leaves are 300 moves away from the root, surely a putative half point
win doesn't translate to a significant advantage, whereas a 100
point win would.
 


I had a breakthrough in my understanding of why it is "surprisingly
difficult to outperform the step function" when analyzing some 9x9 games
with Mogo and ManyFaces. Let's see if I can extract that insight into
words...

I observed that in many situations I could map the winning percentage to
the final score. E.g.
   50-55%: 0.5pt
   55-60%: 1.5pt
   60-65%: 2.5pt
   etc.

It wasn't as clear cut as that. In fact what I was actually noticing was
if I made a 1pt error the winning percentage for the opponent often
jumped by, say, 5%.

Thinking about why... In a given board position moves can be grouped
into sets: the set of correct moves, the set of 1pt mistakes, 2pt
mistakes, etc. Let's assume each side has roughly the same number of
moves each in each of these groupings.

If black is winning by 0.5pt with perfect play, then mistakes by each
side balance out and we get a winning percentage of just over 50%. If he
is winning by 1.5pt then he has breathing space and can make an extra
mistake. Or in other words, at a certain move he can play any of the
moves in the "correct moves" set, or any of the moves in the "1pt
mistakes" set, and still win. So he wins more of the playouts. Say 55%.
If he is winning by 2.5pts then he can make one 2pt mistake or two 1pt
mistakes (more than the opponent) and still win, so he wins more
playouts, 60% perhaps. And so on.

My conclusion was that the winning percentage is more than just an
estimate of how likely the player is to win. It is in fact a crude
estimator of the final score.

Going back to your original comment, when choosing between move A that
leads to a 0.5pt win, and move B that leads to a 100pt win, you should
be seeing move B has a higher winning percentage.

Darren

   



Re: [computer-go] Scoring - step function or sigmoid function?

2009-06-30 Thread Darren Cook

> It seems to be surprisingly difficult to outperform the step function
>  when it comes to mc scoring. I know that many surprises await the mc
> adventurer, but completely discarding the final margin of victory
> just can't be optimal. ...
> an mc program, holding on to a half point victory in the endgame,  is
> a thing of beauty and terror. But in the opening, where the scoring
> leaves are 300 moves away from the root, surely a putative half point
> win doesn't translate to a significant advantage, whereas a 100
> point win would.

I had a breakthrough in my understanding of why it is "surprisingly
difficult to outperform the step function" when analyzing some 9x9 games
with Mogo and ManyFaces. Let's see if I can extract that insight into
words...

I observed that in many situations I could map the winning percentage to
the final score. E.g.
  50-55%: 0.5pt
  55-60%: 1.5pt
  60-65%: 2.5pt
  etc.

It wasn't as clear cut as that. In fact what I was actually noticing was
if I made a 1pt error the winning percentage for the opponent often
jumped by, say, 5%.

Thinking about why... In a given board position moves can be grouped
into sets: the set of correct moves, the set of 1pt mistakes, 2pt
mistakes, etc. Let's assume each side has roughly the same number of
moves each in each of these groupings.

If black is winning by 0.5pt with perfect play, then mistakes by each
side balance out and we get a winning percentage of just over 50%. If he
is winning by 1.5pt then he has breathing space and can make an extra
mistake. Or in other words, at a certain move he can play any of the
moves in the "correct moves" set, or any of the moves in the "1pt
mistakes" set, and still win. So he wins more of the playouts. Say 55%.
If he is winning by 2.5pts then he can make one 2pt mistake or two 1pt
mistakes (more than the opponent) and still win, so he wins more
playouts, 60% perhaps. And so on.

My conclusion was that the winning percentage is more than just an
estimate of how likely the player is to win. It is in fact a crude
estimator of the final score.

Going back to your original comment, when choosing between move A that
leads to a 0.5pt win, and move B that leads to a 100pt win, you should
be seeing move B has a higher winning percentage.
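
A toy Python simulation of the argument above. It assumes a deliberately crude model: each side makes the same number of moves, and each move is independently a 1pt mistake with the same probability. The function name and all parameter values are made up for illustration and are not measurements from Mogo or ManyFaces.

```python
import random


def win_rate(margin, moves_per_side=30, p_mistake=0.2, trials=5000, seed=1):
    """Estimate Black's playout win rate when perfect play wins by `margin`."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        score = margin
        for _ in range(moves_per_side):
            if rng.random() < p_mistake:   # Black makes a 1pt mistake
                score -= 1
            if rng.random() < p_mistake:   # White makes a 1pt mistake
                score += 1
        wins += score > 0                  # Black wins if still ahead
    return wins / trials


for m in (0.5, 1.5, 2.5, 10.5):
    print(f"margin {m:>4}: win rate {win_rate(m):.2f}")
```

In this model the win rate rises with the margin under perfect play, which is the sense in which the winning percentage acts as a crude estimator of the score.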

Darren

-- 
Darren Cook, Software Researcher/Developer
http://dcook.org/gobet/  (Shodan Go Bet - who will win?)
http://dcook.org/mlsn/ (Multilingual open source semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)

[computer-go] Scoring - step function or sigmoid function?

2009-06-30 Thread Stefan Kaitschick

It seems to be surprisingly difficult to outperform the step function when 
it comes to mc scoring.
I know that many surprises await the mc adventurer, but completely 
discarding the final margin of victory just can't be optimal. The sigmoid 
function can be tinkered with, of course, by making its slopes steeper and/or 
by awarding bonus points for victory. But if it looks like the step function 
in the end, then computational resources could have been saved by just using 
the step function from the start. The power of the step function is that it 
directly awards what we are really interested in - victory. And an mc 
program holding on to a half-point victory in the endgame is a thing of 
beauty and terror. But in the opening, where the scoring leaves are 300 
moves away from the root, surely a putative half-point win doesn't translate 
to a significant advantage, whereas a 100-point win would. My suggestion is 
this: how about backing up the individual outcomes through the tree and 
then doing the evaluation at the intermediate nodes, using the sigmoid function 
with parameters depending on the distance from the root? This might be 
too expensive computationally, but shortcuts could be devised. For example, 
wins could be sorted into a couple of different categories (from half-point 
win to landslide), and those categories could be evaluated differently, 
depending on the distance to the root.
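
A rough Python sketch of the two scoring functions and of a depth-dependent evaluation at a node. The post does not say how the parameters should vary, so the direction chosen here (gentle slope near the root, steeper further down) and the names and constants in `node_value` are assumptions.

```python
import math


def step(margin):
    """Step scoring: 1 for any win, 0 for any loss."""
    return 1.0 if margin > 0 else 0.0


def sigmoid(margin, steepness):
    """Sigmoid scoring: approaches the step function as steepness grows."""
    x = steepness * margin
    if x >= 0:                      # branch on sign to avoid exp() overflow
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def node_value(margins, depth, base=0.05, growth=0.02):
    """Average backed-up playout margins with a depth-dependent sigmoid.

    Nodes near the root use a gentle slope, so the margin still matters;
    deeper nodes use a steeper slope and behave more like the step function.
    """
    steepness = base + growth * depth
    return sum(sigmoid(m, steepness) for m in margins) / len(margins)
```

This requires storing or bucketing the margins at each node rather than a single win count, which is the extra cost mentioned above.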


Stefan 


