Re: [computer-go] Floating komi

2008-03-06 Thread Don Dailey


Christoph Birk wrote:
 On Mar 5, 2008, at 11:58 AM, Don Dailey wrote:
 Don Dailey wrote:
  I'm not assuming that MC plays the best move.   The problem isn't the
  assumptions I am making, but the assumptions others are making,  that
  it's NOT playing the best move.  You want to apply a fix to all
  positions without really knowing which positions are a problem.

 One last time: Nobody suggested one fix for all positions/problems.
 The floating komi was suggested to guide the UCT search along
 certain lines of play during specific (close!) endgame positions.
When I said all positions I meant all games.  You expect to apply this
to all winning and losing positions in every game, not just specific ones.

- Don




 Christoph

 ___
 computer-go mailing list
 computer-go@computer-go.org
 http://www.computer-go.org/mailman/listinfo/computer-go/

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Floating komi

2008-03-06 Thread steve uurtamo
why doesn't someone simply try this and post the results,
if they think that it would help?

s.


On Thu, Mar 6, 2008 at 8:30 AM, Don Dailey [EMAIL PROTECTED] wrote:


  Christoph Birk wrote:
   On Mar 5, 2008, at 11:58 AM, Don Dailey wrote:
   Don Dailey wrote:
    I'm not assuming that MC plays the best move.   The problem isn't the
    assumptions I am making, but the assumptions others are making,  that
    it's NOT playing the best move.  You want to apply a fix to all
    positions without really knowing which positions are a problem.

   One last time: Nobody suggested one fix for all positions/problems.
   The floating komi was suggested to guide the UCT search along
   certain lines of play during specific (close!) endgame positions.
  When I said all positions I meant all games.  You expect to apply this
  to all winning and losing positions in every game, not just specific ones.

  - Don



  
   Christoph
  


Re: [computer-go] Floating komi

2008-03-06 Thread Christoph Birk

On Thu, 6 Mar 2008, Don Dailey wrote:

  One last time: Nobody suggested one fix for all positions/problems.
  The floating komi was suggested to guide the UCT search along
  certain lines of play during specific (close!) endgame positions.

 When I said all positions I meant all games.  You expect to apply this
 to all winning and losing positions in every game, not just specific ones.


No.

Christoph


Re: [computer-go] Floating komi

2008-03-06 Thread Matthew Woodcraft
Don Dailey wrote:
 I would be satisfied if someone implemented it,  reported a 500 game
 self-test sample and concluded that it didn't hurt the program
 measurably and showed a few examples of how it improved the moves
 cosmetically,   perhaps even comparing both versions with specific
 positions.

Remember that there's good reason to believe that this technique will be
less helpful in self-play.
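
For scale, here is a rough sketch of what a 500-game sample can resolve. It
assumes independent games with no draws and uses the normal approximation
to the binomial; the function name win_rate_interval is made up for
illustration and is not from any program discussed here.

```python
import math


def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% confidence interval for a win rate (normal approximation)."""
    p = wins / games
    standard_error = math.sqrt(p * (1.0 - p) / games)
    return p - z * standard_error, p + z * standard_error


# An even 250-250 result over 500 games.
low, high = win_rate_interval(250, 500)
print(f"{low:.3f} .. {high:.3f}")
```

Under those assumptions an even result leaves an interval of roughly plus
or minus 4.4 percentage points, so a 500-game test can only rule out
fairly large changes in strength.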

-M-


Re: [computer-go] Floating komi

2008-03-05 Thread Don Dailey


Hideki Kato wrote:
 I'd like to give here an example to make things clear.

 The conditions are:
 1) Using digitizing scheme that maps real score to [0,1] (or [-1,1]) 
 so that the program cannot distinguish losing/winning by 0.5 or 10.5 
 pt at all.
 2) Playouts include some foolish moves (usually with low 
 but not zero probability), not to connect large groups in atari 
 position for example, due to hold its randomness.
 3) The position is at early endgame where there are no moves that 
 gain greater than 2 pt, for example, in perfect play.
 4) Black is behind by 0.5 pt.

 The playouts may return winning but gambling move (perhaps with low 
 probability) under above conditions, especially in case of the number 
 of playouts is small which is usually true on 19x19, and UCT will 
 choose it.

 The question is, which is better to keep 0.5 pt behind or to play 
 gambling moves (here I mean such moves that B will lose many pts if W 
 will answer correctly) with expecting W's (stupid) mistakes?

   
The assumption is that you suddenly cannot trust MC to do what it does
best even though you did for the entire game up until this point.   MC
of course will choose the gambling move.  The whole concept of MC
is to do what is most likely to produce a win.  

We should think twice before asking it to choose the move that produces
the more sure loss.  We are the ones that have a bias about this, not
the MC programs.  


 In addition to above, there is one more issue to consider. If the 
 playout has a systematic error, nakade for example, it's not good to 
 keep 0.5 pt ahead.  Having more margin is clearly better.
   
I believe nakade is a strawman.  There are lots of things MC does
better and lots of things it does less well.  You can always find
positions that are hard or easy for your program to solve, but that isn't
intrinsic to this issue.  I don't think you should weaken this
concept of playing for the best winning chances for the very few
positions where MC programs take longer to resolve the endgame and there
is a slight chance that it will win if it just happens to be enough to
cover the exact situation.   This is no solution - it is at best
a patch and would only work in some cases.   If I could do something
that didn't hurt the program in other ways, but might help certain
positions once in a while,  I would go for it.  I've been in game
programming a long time - if you have a problem with certain types of
positions you really want a pointed solution that has little or no
impact on other positions.   You don't want to be going back and forth
fixing things up; you want to solve the problem as correctly as
possible the first time.   I'll call this principle "every solution
has a side effect", but this is a pretty bad side effect.  (I can't
tell you how many times I fixed something in my chess program with
some evaluation change only to find that I broke many other things at
the same time.)

- Don



 The idea of floating komi helps above two.  I'd like to emphasize that 
 I know it's not a universal solution.  As it seems, however, very hard 
 to solve nakade problem, it could be a practical solution.

 -Hideki
 --
 [EMAIL PROTECTED] (Kato)


Re: [computer-go] Floating komi

2008-03-05 Thread Raymond Wold

Don Dailey wrote:

 Hideki Kato wrote:

  I'd like to give here an example to make things clear.

  The conditions are:
  1) Using digitizing scheme that maps real score to [0,1] (or [-1,1])
  so that the program cannot distinguish losing/winning by 0.5 or 10.5
  pt at all.
  2) Playouts include some foolish moves (usually with low
  but not zero probability), not to connect large groups in atari
  position for example, due to hold its randomness.
  3) The position is at early endgame where there are no moves that
  gain greater than 2 pt, for example, in perfect play.
  4) Black is behind by 0.5 pt.

  The playouts may return winning but gambling move (perhaps with low
  probability) under above conditions, especially in case of the number
  of playouts is small which is usually true on 19x19, and UCT will
  choose it.

  The question is, which is better to keep 0.5 pt behind or to play
  gambling moves (here I mean such moves that B will lose many pts if W
  will answer correctly) with expecting W's (stupid) mistakes?

 The assumption is that you suddenly cannot trust MC to do what it does
 best even though you did for the entire game up until this point.   MC
 of course will choose the gambling move.  The whole concept of MC
 is to do what is most likely to produce a win.


Not entirely, no. The concept of MC is to do what has most lines leading
to a win, which is slightly different. There's obviously a strong
correlation, or MC wouldn't work at all, but I think it's dangerous to
assume that MC by definition plays the best move. For one thing it makes
it very hard to argue about how to improve MC programs; it creates lots
of noise along the lines of "don't do that, it will only make your program weaker."
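
The difference between "most lines leading to a win" and "most likely to
win" can be sketched with a toy position. All numbers and names below are
made up for illustration, and the uniform average only stands in for a
small number of random playouts; UCT itself shifts toward the strongest
reply as simulations accumulate.

```python
# Hypothetical values: Black's chance of eventually winning after each
# possible White reply to a candidate move (made-up numbers).
candidates = {
    "gamble": [1.0, 1.0, 1.0, 0.0],  # wins unless White finds the one refutation
    "solid": [0.6, 0.6, 0.6, 0.6],   # no refutation, but no easy win either
}


def value_vs_random_replies(replies):
    """Fraction of winning lines: the average over uniformly random replies."""
    return sum(replies) / len(replies)


def value_vs_best_reply(replies):
    """Value against an opponent who always picks the reply worst for Black."""
    return min(replies)


def best_move(value_fn):
    """Return the candidate move that maximizes the given evaluation."""
    return max(candidates, key=lambda move: value_fn(candidates[move]))


print(best_move(value_vs_random_replies))
print(best_move(value_vs_best_reply))
```

Counting lines prefers the gamble, while an opponent who always finds the
refutation makes the solid move the better choice.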


 We should think twice before asking it to choose the move that produces
 the more sure loss.  We are the ones that have a bias about this, not
 the MC programs.

  In addition to above, there is one more issue to consider. If the
  playout has a systematic error, nakade for example, it's not good to
  keep 0.5 pt ahead.  Having more margin is clearly better.

 I believe nakade is a strawman.  There are lots of things MC does
 better and lots of things it does less well.  You can always find
 positions that are hard or easy for your program to solve, but that isn't
 intrinsic to this issue.  I don't think you should weaken this
 concept of playing for the best winning chances for the very few
 positions where MC programs take longer to resolve the endgame and there
 is a slight chance that it will win if it just happens to be enough to
 cover the exact situation.   This is no solution - it is at best
 a patch and would only work in some cases.


I don't think patching one thing at a time is such a bad way to write a
go program. Small steps, one at a time, and you suddenly have a much
stronger program. And again you're making the assumption that to deviate
from accurate MC means lower winning chances. It might mean fewer winning
LINES, but the probability of a loss or win is entirely dependent on how
the opponent plays, which is (hopefully) never random. And this does not
mean you're doing opponent modeling, or - if you define opponent
modeling very loosely so it includes this - that what you're doing is bad.


 If I could do something
 that didn't hurt the program in other ways, but might help certain
 positions once in a while,  I would go for it.


I don't think you'll find ANY improvement to ANY non-trivial program
that doesn't, in some cases, make it play worse. What matters is how it
does in the average case.


 I've been in game
 programming a long time - if you have a problem with certain types of
 positions you really want a pointed solution that has little or no
 impact on other positions.   You don't want to be going back and forth
 fixing things up; you want to solve the problem as correctly as
 possible the first time.   I'll call this principle "every solution
 has a side effect", but this is a pretty bad side effect.  (I can't
 tell you how many times I fixed something in my chess program with
 some evaluation change only to find that I broke many other things at
 the same time.)


A good understanding of _why_ your program works can help this a lot,
ensuring that you know how to fix a problem without causing bigger
problems elsewhere. I think part of the problem here is that few (if
any) people know _why_ MC works. Why does the cumulative result of
random play-outs correlate so strongly with the strength of the
position? In what ways does it NOT correlate, that can be fixed?




Re: [computer-go] Floating komi

2008-03-05 Thread Don Dailey


Raymond Wold wrote:
 Don Dailey wrote:

 Hideki Kato wrote:
 I'd like to give here an example to make things clear.

 The conditions are:
 1) Using digitizing scheme that maps real score to [0,1] (or [-1,1])
 so that the program cannot distinguish losing/winning by 0.5 or 10.5
 pt at all.
 2) Playouts include some foolish moves (usually with low but not
 zero probability), not to connect large groups in atari position for
 example, due to hold its randomness.
 3) The position is at early endgame where there are no moves that
 gain greater than 2 pt, for example, in perfect play.
 4) Black is behind by 0.5 pt.

 The playouts may return winning but gambling move (perhaps with low
 probability) under above conditions, especially in case of the number
 of playouts is small which is usually true on 19x19, and UCT will
 choose it.

 The question is, which is better to keep 0.5 pt behind or to play
 gambling moves (here I mean such moves that B will lose many pts if
 W will answer correctly) with expecting W's (stupid) mistakes?

   
 The assumption is that you suddenly cannot trust MC to do what it does
 best even though you did for the entire game up until this point.   MC
 of course will choose the gambling move.  The whole concept of MC
 is to do what is most likely to produce a win.  

 Not entirely, no. The concept of MC is to do what has most lines leading
 to a win, which is slightly different. There's obviously a strong
 correlation, or MC wouldn't work at all, but I think it's dangerous to
 assume that MC by definition plays the best move. For one thing it makes
 it very hard to argue about how to improve MC programs; it creates lots
 of noise along the lines of "don't do that, it will only make your program weaker."
I'm not assuming that MC plays the best move.   The problem isn't the
assumptions I am making, but the assumptions others are making,  that
it's NOT playing the best move.  You want to apply a fix to all
positions without really knowing which positions are a problem.   Now if
you want to talk about noise,  you have just added more noise if you do
that.   

I can give you an example of why you must be extremely careful about
making unfounded assumptions:

My naive simple MC program would often fail to make a needed but small
capture move because of noise in the play-outs.  The move would be the
second or third choice but not quite make it to the top - instead it
would play a speculative attack or something I considered rather
stupid.

My fix was to assume that this was a general problem,  that it didn't
know what it was doing - after all,  I could see with my own eyes that
this was a problem.  So I applied a general incentive to encourage
it to make capture moves.   

Guess what?   That was a mistake.   The program was actually pretty good
at not being greedy about captures - most of the time the capture can
wait and you essentially lose a stone every time you play a greedy
capture.   But now when it needed to do something useful it preferred to
waste time making a capture that could wait indefinitely (the group was
dead, the capture could be made later when there were no important moves
left.)   

I backed the change out in a hurry and realized that I would never get
it perfect unless I applied a very specific fix.  I already knew that
from computer chess.   In one early program I had a bonus for a rook on
the 7th rank, a good general principle.  We played one of the top US
grandmasters in a casual game once and the computer put the rook on the
7th rank in a pawn-less endgame where the kings were already
centralized.  The rook incentive got in the way of it finding the
right plan.   Although the rook-on-the-7th heuristic made the program
stronger in general,  it really wasn't implemented as well as it could
be.  Every good general principle in chess, and I'm sure in Go too,  if
taken too seriously, has a side effect and will cause problems, and the
solution is to try to address things as specifically as you can within
reason.

In my opinion,  your proposed fix for this imagined problem is
equivalent to my deciding to reverse the rook-on-the-7th bonus and turn
it into a penalty because I saw it go wrong once.

In this case, my working assumption would be that monte carlo programs
know what they are doing (much more often than not) when they play risky
moves in losing positions. In other words, are you trying to fix
something that is not broken?  

I think there is a disconnect too between what appeals to the eye and
what actually works.   Most reasonable go players won't take risks when
the score is close because they think the game is about equal when it
probably isn't.  Near the end of the game with Chinese scoring,  the
chances are rarely close to even - you just think they are because
the territory is about even, so it seems the game could go
either way.   When I look at the logs of Lazarus,  the games are
virtually over well before the game actually ends.

Re: [computer-go] Floating komi

2008-03-05 Thread Hideki Kato
Don Dailey: [EMAIL PROTECTED]:


 Hideki Kato wrote:
  I'd like to give here an example to make things clear.

  The conditions are:
  1) Using digitizing scheme that maps real score to [0,1] (or [-1,1])
  so that the program cannot distinguish losing/winning by 0.5 or 10.5
  pt at all.
  2) Playouts include some foolish moves (usually with low
  but not zero probability), not to connect large groups in atari
  position for example, due to hold its randomness.
  3) The position is at early endgame where there are no moves that
  gain greater than 2 pt, for example, in perfect play.
  4) Black is behind by 0.5 pt.

  The playouts may return winning but gambling move (perhaps with low
  probability) under above conditions, especially in case of the number
  of playouts is small which is usually true on 19x19, and UCT will
  choose it.

  The question is, which is better to keep 0.5 pt behind or to play
  gambling moves (here I mean such moves that B will lose many pts if W
  will answer correctly) with expecting W's (stupid) mistakes?

 The assumption is that you suddenly cannot trust MC to do what it does
 best even though you did for the entire game up until this point.   MC
 of course will choose the gambling move.  The whole concept of MC
 is to do what is most likely to produce a win.

 We should think twice before asking it to choose the move that produces
 the more sure loss.  We are the ones that have a bias about this, not
 the MC programs.

Whatever the 'concept' of MC is, I will just fix its weaknesses by
any means.  MC is just a method, IMHO, to evaluate a position better
than a static function does.  # I'm just an engineer :).

  In addition to above, there is one more issue to consider. If the
  playout has a systematic error, nakade for example, it's not good to
  keep 0.5 pt ahead.  Having more margin is clearly better.

 I believe nakade is a strawman.  There are lots of things MC does
 better and lots of things it does less well.  You can always find
 positions that are hard or easy for your program to solve, but that isn't
 intrinsic to this issue.  I don't think you should weaken this
 concept of playing for the best winning chances for the very few
 positions where MC programs take longer to resolve the endgame and there
 is a slight chance that it will win if it just happens to be enough to
 cover the exact situation.   This is no solution - it is at best
 a patch and would only work in some cases.   If I could do something
 that didn't hurt the program in other ways, but might help certain
 positions once in a while,  I would go for it.  I've been in game
 programming a long time - if you have a problem with certain types of
 positions you really want a pointed solution that has little or no
 impact on other positions.   You don't want to be going back and forth
 fixing things up; you want to solve the problem as correctly as
 possible the first time.   I'll call this principle "every solution
 has a side effect", but this is a pretty bad side effect.  (I can't
 tell you how many times I fixed something in my chess program with
 some evaluation change only to find that I broke many other things at
 the same time.)

Although I'm afraid my English skill is not enough to understand such a
long paragraph correctly (shorter is better :), I believe it's just a
choice.  Some are good at developing an essential solution but not
all.  As I know I'm worse at that, I just do what I can.

It was developed for the First UEC Cup, which uses
Japanese rules that count seki differently, so that all MC programs
need to have some margin.  Rémi introduced a fixed margin by changing
komi for Crazy Stone and I've developed a dynamic one.  Are you
saying we have to wait until we develop an essential solution?

-Hideki

 - Don

  The idea of floating komi helps above two.  I'd like to emphasize that
  I know it's not a universal solution.  As it seems, however, very hard
  to solve nakade problem, it could be a practical solution.

  -Hideki
  --
  [EMAIL PROTECTED] (Kato)
--
[EMAIL PROTECTED] (Kato)


Re: [computer-go] Floating komi

2008-03-05 Thread Christoph Birk

On Mar 5, 2008, at 11:58 AM, Don Dailey wrote:

 Don Dailey wrote:
  I'm not assuming that MC plays the best move.   The problem isn't the
  assumptions I am making, but the assumptions others are making,  that
  it's NOT playing the best move.  You want to apply a fix to all
  positions without really knowing which positions are a problem.

One last time: Nobody suggested one fix for all positions/problems.
The floating komi was suggested to guide the UCT search along
certain lines of play during specific (close!) endgame positions.

Christoph



[computer-go] Floating komi

2008-03-04 Thread Hideki Kato
I'd like to give an example here to make things clear.

The conditions are:
1) The program uses a digitizing scheme that maps the real score to
[0,1] (or [-1,1]), so that it cannot distinguish losing/winning by
0.5 pt from losing/winning by 10.5 pt at all.
2) Playouts include some foolish moves (usually with low but not zero
probability), such as not connecting large groups in atari, in order
to keep their randomness.
3) The position is in the early endgame, where no move gains more
than 2 pt, for example, in perfect play.
4) Black is behind by 0.5 pt.

Under the above conditions the playouts may report a gambling move as
winning (perhaps with low probability), especially when the number of
playouts is small, which is usually true on 19x19, and UCT will
choose it.

The question is: which is better, to stay 0.5 pt behind or to play
gambling moves (here I mean moves with which B will lose many pts if W
answers correctly) in expectation of W's (stupid) mistakes?


In addition to the above, there is one more issue to consider. If the
playout has a systematic error, nakade for example, it's not good to
stay just 0.5 pt ahead.  Having more margin is clearly better.

The idea of floating komi helps with both of the above.  I'd like to
emphasize that I know it's not a universal solution.  However, as the
nakade problem seems very hard to solve, it could be a practical
solution.
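
A minimal sketch of the effect described above, assuming binary playout
scoring and a komi shift applied only when the playout is scored. The
names (playout_result, win_rate, bonus) and the sample margins are made up
for illustration, and nothing here says how or when the shift should be
chosen; it is not the implementation of any particular program.

```python
def playout_result(margin, bonus=0.0):
    """Binary playout score for Black: 1 for a win, 0 for a loss.

    `margin` is Black's final score minus White's, real komi included.
    `bonus` is a hypothetical komi shift in Black's favour.
    """
    return 1 if margin + bonus > 0 else 0


def win_rate(margins, bonus=0.0):
    """Fraction of playouts scored as Black wins under the given shift."""
    return sum(playout_result(m, bonus) for m in margins) / len(margins)


# Made-up playout margins for two candidate moves.
safe = [-0.5] * 10                 # always ends 0.5 pt behind
gamble = [5.5] * 2 + [-20.5] * 8   # wins big rarely, loses big otherwise

for bonus in (0.0, 1.0):
    print(bonus, win_rate(safe, bonus), win_rate(gamble, bonus))
```

With no shift the safe move scores 0.0 and the gamble 0.2, so the gamble
is preferred; with a 1-point shift the safe move scores 1.0 while the
gamble stays at 0.2.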

-Hideki
--
[EMAIL PROTECTED] (Kato)