Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Stefan Kaitschick


Maybe I should ask first, for clarity's sake: is MCTS performance in
handicap games currently a problem?

Mark



Yes, it's a big problem, and that's not a matter of opinion.
MC bots that lead a game by a large margin will lightly give away their
advantage, except for the last half point.
Even on a 9*9 board, even if the bot wins more games on even with 7.5 komi,
that doesn't mean that it's impossible for the human to win while giving a
2 stone handicap. All it takes is a single bot misjudgement after the game
gets close.
Granted, bots are really excellent at defending the last half point of
advantage tooth and claw. I'm just saying that it should be impossible for
the human to win on 2 stones, and it isn't.
If they are behind by a large margin, they will play either random or ko
threat type moves.
So there is a kind of symmetry here. Being too far ahead or too far behind
ruins the bot's play.
The biggest practical problem right now is poor play against pros on a 19*19
board, taking a large handicap.
Special fuseki patterns are only a patch. When, after a decent opening, the
regular patterns take over, they usually immediately start to work against
the bot's own previous moves.
Looking into the horse's mouth, instead of invoking Aristotle, is really the
only way to find out.
I had hoped that programmers would find the idea interesting enough to try
it out.
Instead, I found myself in a hand-waving contest. Granted, I started it, so
I can't complain.
Thanks to Ingo for simulating dynamic komi by hand to give programmers
something less speculative.
Btw, I played 2 games (as gogonuts) on KGS against goIngo (really ManyFaces).
I won both on 5 stones. But in the first one, with komi adjusted by Ingo, I
had to make a very critical invasion that should not really have worked. In
the second game I won without problems.

At the time, Ingo adjusted the komi to give W a 50% win rate.
Since then, with his limited trials, Ingo found out that adjusting the komi
to give each side a 50% win rate isn't optimal. His current rule is to
adjust to 42% for W. This is of course only a crude start, but sophistication
can only be introduced by programmers.
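
For programmers who want something concrete to start from, here is a minimal
sketch of the "adjust komi until W sits at a target win rate" rule. It is not
Ingo's procedure or any program's actual code. The names find_komi_for_target
and toy_winrate are made up for illustration, the toy estimator stands in for
a real MCTS search, and the bisection assumes that W's win rate never drops
when W is given more komi. Real search results are noisy, so a real version
would need some tolerance for that.

```python
import math
from typing import Callable


def find_komi_for_target(
    estimate_winrate: Callable[[float], float],
    target: float = 0.42,
    lo: float = -200.0,
    hi: float = 200.0,
    tol: float = 0.5,
) -> float:
    """Bisect for the komi at which White's estimated win rate is near `target`.

    Assumption: estimate_winrate(komi) is non-decreasing in komi, where komi
    counts points credited to White.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if estimate_winrate(mid) < target:
            lo = mid  # White still below target: needs more komi
        else:
            hi = mid  # White at or above target: try less komi
    return (lo + hi) / 2.0


def toy_winrate(komi: float, deficit: float = 50.0, spread: float = 10.0) -> float:
    """Hypothetical stand-in for a search: logistic in (komi - deficit)."""
    return 1.0 / (1.0 + math.exp(-(komi - deficit) / spread))


if __name__ == "__main__":
    komi = find_komi_for_target(toy_winrate, target=0.42)
    print(f"komi = {komi:.2f}, W win rate = {toy_winrate(komi):.3f}")
```

With a real engine, estimate_winrate would run a search at the given komi and
return the win rate of the best move for W.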


Stefan




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Monte-Carlo Simulation Balancing

2009-08-13 Thread Isaac Deutsch
I admit I had trouble understanding the details of the paper. What I  
think is the biggest problem for applying this to bigger (up to 19x19)  
games is that you somehow need access to the true value of a move,  
i.e. it's a win or a loss. On the 5x5 board they used, this might be  
approximated pretty well, but there's no chance on 19x19 to do so.



On 13.08.2009 at 05:14, Michael Williams wrote:

After about the 5th reading, I'm concluding that this is an  
excellent paper.  Is anyone (besides the authors) doing research  
based on this?  There is a lot to do.






Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Tapani Raiko
I don't think the komi should be adjusted.

Instead:

Wouldn't random passing by black during the playouts model black making
mistakes much more accurately? The number of random passes should be
adjusted such that the playouts are close to 50/50. Adjusting the komi
would make black play greedily, while random passing during playouts
would make black play safe (rich men don't pick fights).
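
A toy sketch of the calibration step, under loud assumptions: the "game" below
is not Go at all, just two sides alternately adding 0 to 3 points to a running
margin, with Black starting ahead by `lead`. The function names are invented
for illustration. It only shows how a pass probability for Black could be
tuned by bisection until the playouts come out close to 50/50.

```python
import random


def playout(lead: int, moves: int, pass_prob: float, rng: random.Random) -> bool:
    """One toy playout. Returns True if Black ends with a positive margin."""
    margin = lead
    for turn in range(moves):
        if turn % 2 == 0:  # Black to move
            if rng.random() < pass_prob:
                continue  # Black passes: models a wasted move
            margin += rng.randint(0, 3)
        else:  # White to move, never passes
            margin -= rng.randint(0, 3)
    return margin > 0


def black_winrate(lead: int, moves: int, pass_prob: float, n: int, seed: int = 0) -> float:
    """Fraction of n toy playouts won by Black."""
    rng = random.Random(seed)
    return sum(playout(lead, moves, pass_prob, rng) for _ in range(n)) / n


def calibrate_pass_prob(lead: int, moves: int, n: int = 2000, steps: int = 12) -> float:
    """Bisect on Black's pass probability until playouts are roughly 50/50."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if black_winrate(lead, moves, mid, n) > 0.5:
            lo = mid  # Black still wins too often: pass more
        else:
            hi = mid  # Black wins too rarely: pass less
    return (lo + hi) / 2.0


if __name__ == "__main__":
    p = calibrate_pass_prob(lead=18, moves=60)
    print(f"pass probability = {p:.3f}")
```

In a real program the playout would be the engine's own playout policy, and
the calibration would probably have to be repeated as the game progresses.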

Tapani Raiko

Christoph Birk wrote:

 I think you got it the wrong way round.
 Without dynamic komi (in high handicap games) even trillions of simulations
 will _not_ find a move that creates a winning line, because there is none,
 if the opponent has the same strength as you.
 WHITE has to assume that BLACK will make mistakes, otherwise there
 would be no handicap.

 Christoph


-- 
 Tapani Raiko, tapani.ra...@tkk.fi, +358 50 5225750
 http://www.iki.fi/raiko/



Re: [computer-go] Monte-Carlo Simulation Balancing

2009-08-13 Thread Michael Williams
In future papers they should avoid using a strong authority like Fuego for the training and instead force it to learn from a naive uniform random playout policy 
(with 100x or 1000x more playouts) and then build on that with an iterative approach (as was suggested in the paper).


I also had another thought.  Since they are training the policy to maximize the balance and not the winrate, wouldn't you be able to extract more information 
from each trial by using the score instead of the game result?  The normal pitfalls to doing so do not apply here.




Isaac Deutsch wrote:
I admit I had trouble understanding the details of the paper. What I 
think is the biggest problem for applying this to bigger (up to 19x19) 
games is that you somehow need access to the true value of a move, 
i.e. it's a win or a loss. On the 5x5 board they used, this might be 
approximated pretty well, but there's no chance on 19x19 to do so.





Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Hideki Kato
I'd like to say that the problem comes from the fact that the model of
the opponent in the simulations is not accurate enough in the MCTS
framework.  So, the solution would be to make the model more precise,
but in practice this makes no sense.

What is komi or handicap?  Since W is stronger than B, W must gain
some points in the future at any position in a game.  Handicap can be
thought of as those points at the initial position (assuming handicap
stones can be converted to handicap points).

Hence, handicap points could be used to correct the model of the
opponent.  For example, assuming a 7 stone handicap is equivalent to 70
points, we can use 70 for the hidden komi at the beginning and
decrease it towards the end of the game.
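
A minimal sketch of such a schedule. The 10 points per stone comes from the
example above; the linear shape, the assumed game length of 250 moves, and
the function name hidden_komi are my own placeholders, not something any
program is known to use.

```python
def hidden_komi(
    move_number: int,
    handicap_stones: int,
    points_per_stone: float = 10.0,
    total_moves: int = 250,
) -> float:
    """Hidden komi that starts at stones * points_per_stone and decays to 0.

    Assumption: linear decay over an expected game length of `total_moves`.
    """
    start = handicap_stones * points_per_stone
    remaining = max(0.0, 1.0 - move_number / total_moves)
    return start * remaining


if __name__ == "__main__":
    for move in (0, 50, 125, 200, 250):
        print(move, hidden_komi(move, handicap_stones=7))
```

Other shapes (faster decay in the opening, or decay tied to the estimated
score rather than the move number) would fit the same interface.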

Does this make sense?

Hideki

Don Dailey: 5212e61a0908121411y3198e9d9m55441378fa01...@mail.gmail.com:
The problem with MCTS programs is that they like to consolidate.   You set
the komi and thereby give them a goal and they very quickly make moves which
commit to that specific goal.   Committing to less than you need to actually
win will often involve sacrificing chances to win.   Sometimes it won't,
but you cannot have a scalable algorithm which is this arbitrary.

However, if the handicap is too high, the program thinks every line is a
loss and it plays randomly.   That's why we even consider doing this.

Dynamically changing komi could be of some benefit in that situation if
there is no alternative reasonable strategy,   but it does not address the
real problem - which is what I call the committal consolidation
problem.  You are giving the program an arbitrary short term goal which
may,  or may not be compatible with the long term goal of winning the
game. Whether it's compatible or not is based on your own credulity -
not anything predictable or that you can scale.   And as the base program
gets stronger this aspect of the program becomes more and more of a wart.

If this can be made to work in the short term,  it should be considered a
temporary hack which should be fixed as soon as possible.

We have to think about this anyway sooner or later because if programs
continue to develop and the predictive ability of the playouts and tree
search gets several hundred ELO better,  these programs may start to see
more and more positions as either dead won or dead lost.  I'm sure we
will want some kind of robust mechanism for dealing with this which is
better at estimating chances that the opponent will go wrong  as opposed to
doing something that is a random benefit or hindrance.

- Don







2009/8/12 terry mcintyre terrymcint...@yahoo.com

 Ingo suggested something interesting - instead of changing the komi
 according to the move number, or some other fixed schedule, it varies
 according to the estimated winrate.

 It also, implicitly, depends on one's guess of the ability of the opponent.


 An interesting test would be to take an opponent known to be weaker, offer
 it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what
 handicap does the ratio balance at 50:50? Can the number of handicap stones
 be increased with such an adaptive algorithm?

 Even better, play against a stronger opponent; can one increase the win
 rate versus strong opponents?

 The usual range of computer opponents is fairly narrow. None approach
 high-dan levels on 19x19 boards - yet.

 Terry McIntyre terrymcint...@yahoo.com

 “We hang the petty thieves and appoint the great ones to public office.” --
 Aesop
 --
 *From:* Brian Sheppard sheppar...@aol.com
 *To:* computer-go@computer-go.org
 *Sent:* Wednesday, August 12, 2009 12:33:13 PM
 *Subject:* [computer-go] Dynamic komi at high handicaps

 The small sample size is probably the least of the problems with this.  Do
 you actually believe that you can play games against it and not be
 subjective in your observations or how you play against it?

 These are computer-vs-computer games. Ingo is manually transferring moves
 between two computer opponents.

 The result does support Ingo's belief that dynamic Komi will help programs
 play high handicap games. Due to small sample size it isn't very strong
 evidence. But maybe it is enough to induce a programmer who actually plays
 in such games to create a more exhaustive test.

--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)


Re: [computer-go] Monte-Carlo Simulation Balancing

2009-08-13 Thread Jason House
A web search turned up a 2 page and an 8 page version. I read the  
short one. I agree that it's promising work that requires some follow- 
up research.


Now that you've read it so many times, what excites you about it? Can  
you envision a way to scale it to larger patterns and boards on modern  
hardware?


Sent from my iPhone

On Aug 12, 2009, at 11:14 PM, Michael Williams michaelwilliam...@gmail.com 
 wrote:


After about the 5th reading, I'm concluding that this is an  
excellent paper.  Is anyone (besides the authors) doing research  
based on this?  There is a lot to do.



David Silver wrote:

Hi everyone,
Please find attached my ICML paper with Gerry Tesauro on  
automatically learning a simulation policy for Monte-Carlo Go. Our  
preliminary results show a 200+ Elo improvement over previous  
approaches, although our experiments were restricted to simple  
Monte-Carlo search with no tree on small boards.

Abstract
In this paper we introduce the first algorithms for efficiently  
learning a simulation policy for Monte-Carlo search. Our main idea  
is to optimise the balance of a simulation policy, so that an  
accurate spread of simulation outcomes is maintained, rather than  
optimising the direct strength of the simulation policy. We develop  
two algorithms for balancing a simulation policy by gradient  
descent. The first algorithm optimises the balance of complete  
simulations, using a policy gradient algorithm; whereas the second  
algorithm optimises the balance over every two steps of simulation.  
We compare our algorithms to reinforcement learning and supervised  
learning algorithms for maximising the strength of the simulation  
policy. We test each algorithm in the domain of 5x5 and 6x6  
Computer Go, using a softmax policy that is parameterised by  
weights for a hundred simple patterns. When used in a simple Monte- 
Carlo search, the policies learnt by simulation balancing achieved  
significantly better performance, with half the mean squared error  
of a uniform random policy, and equal overall performance to a  
sophisticated Go engine.

-Dave


[computer-go] Monte-Carlo Simulation Balancing

2009-08-13 Thread Brian Sheppard
Is anyone (besides the authors) doing research based on this?

Well, Pebbles does apply reinforcement learning (RL) to improve
its playout policy. But not in the manner described in that paper.
There are practical obstacles to directly applying that paper.

To directly apply that paper, you must have a CrazyStone
playout design, wherein you maintain 3x3 neighborhoods around
each point. Pebbles has a Mogo playout design, where you check
for patterns only around the last move (or two).

To directly pursue this would require a rewrite. Right now, there
is no published evidence that the Mogo design is inferior. In fact,
two of the world's best programs use the Mogo design (Mogo, Fuego).
So I am unwilling to make that commitment.

I would also have to research how to scale that paper to
realistic conditions, including

   1) 9x9 boards at a minimum.
   2) Self-play, instead of assuming an oracle.
   3) Playout after a UCT/RAVE search rather than pure MC.
   4) Pattern sets that have ~1 million parameters.
   5) Pattern sets that have more general geometry than 3x3, perhaps.

My guess is that all of these research problems are solvable. But
that's a lot of work to do. If I had to face this task list, I
would put it off until later, because there is always an easier
way to make progress.




Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Don Dailey
This idea makes much more sense to me than adjusting komi does.   At least
it's an attempt at opponent modeling, which is the actual problem that
should be addressed. Whether it will actually work is something that
could be tested.

Another similar idea is not to pass but to play some percentage of random
moves - which probably would work in programs with strong playout
strategies.   Of course this would be meaningless for bots that have weak
(and already random) playout strategies.

- Don




On Thu, Aug 13, 2009 at 6:17 AM, Tapani Raiko pra...@cis.hut.fi wrote:

 I don't think the komi should be adjusted.

 Instead:

 Wouldn't random passing by black during the playouts model black making
 mistakes much more accurately? The number of random passes should be
 adjusted such that the playouts are close to 50/50. Adjusting the komi
 would make black play greedily, while random passing during playouts
 would make black play safe (rich men don't pick fights).

 Tapani Raiko


Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Don Dailey
On Thu, Aug 13, 2009 at 1:39 AM, Christoph Birk b...@ociw.edu wrote:


 On Aug 12, 2009, at 3:43 PM, Don Dailey wrote:

 I believe the only thing wrong with the current MCTS strategy is that you
 cannot get a statistically meaningful number of samples when almost all
 games are won or lost.   You can get a more meaningful NUMBER of samples by
 adjusting komi,  but unfortunately you are sampling the wrong thing - an
 approximation of the actual goal.
 Since the approximation may be wrong or right,  your algorithm is not
 scalable.   You could run on a billion processors sampling billions of nodes
 per second and, with no flaw in the search or the playouts, still play a
 move that gives you no chance of winning.


 I think you got it the wrong way round.
 Without dynamic komi (in high handicap games) even trillions of simulations
 will _not_ find a move that creates a winning line, because there is none,
 if the opponent has the same strength as you.
 WHITE has to assume that BLACK will make mistakes, otherwise there
 would be no handicap.


I'm not trying to define the problem - that has already been done and I
agree with you - if the situation is hopeless the computer will play
randomly regardless of the number of playouts.

I'm explaining why this solution is imperfect and not scalable.   I did not
say it would make it play worse than nothing at all.

- Don





 Christoph


Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread terry mcintyre
One reason dynamic komi seems a bit odd is that the numbers are pulled out of 
thin air. Why should the komi be X instead of Y? When should the value be 
changed?

Going back to the original thought experiment: the komi at the start of the 
game should reflect the expert assessment of how far ahead black is compared to 
white. 

A rational program should periodically change that assessment, as black 
blunders through the game.

So, my next question is, what sort of experimentation has been done to assess 
the likely score at various parts of the game? Any results?

It seems natural that a strong player, looking at a lot of handicap stones, 
will recognize that the position against an equally strong player would entail 
a loss - but of about n*10 stones, not of n*20 stones - and set an interim goal 
to acquire that much, while leaving opportunities for black to fumble. As such 
fumbles occur, white opportunistically consolidates more territory, and 
expectations are adjusted upwards -- only a 40 point loss now ... only 10 
points now ... striking distance ... black is torpedoed now!




Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Isaac Deutsch
With CrazyStone-like playouts, you can just put noise over the
possibilities. The more noise, the more random the playout is, which is
weaker. The best move in the tree is then the one that requires the
least amount of noise for the other player to reach 50% win chance if
behind, or the one that requires the most amount of noise for me if
ahead. Would that work?
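
One way to read "putting noise over the possibilities" is to blend the playout
policy's move probabilities with a uniform distribution. This is only a
sketch of that reading: the blending formula and the names noisy_policy and
sample_move are assumptions, not how CrazyStone or any other program does it.

```python
import random
from typing import List, Sequence


def noisy_policy(probs: Sequence[float], noise: float) -> List[float]:
    """Blend move probabilities with uniform noise.

    noise = 0.0 keeps the policy unchanged; noise = 1.0 is fully random.
    """
    if not 0.0 <= noise <= 1.0:
        raise ValueError("noise must be in [0, 1]")
    n = len(probs)
    return [(1.0 - noise) * p + noise / n for p in probs]


def sample_move(moves: Sequence[str], probs: Sequence[float],
                noise: float, rng: random.Random) -> str:
    """Draw one move from the noisy version of the policy."""
    return rng.choices(moves, weights=noisy_policy(probs, noise), k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(noisy_policy([0.7, 0.2, 0.1], 0.5))
    print(sample_move(["a", "b", "c"], [0.7, 0.2, 0.1], 0.5, rng))
```

Finding the noise level at which a side reaches 50% would then be a search
over `noise`, with each candidate level evaluated by playouts.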


On 13.08.2009 at 16:02, Don Dailey wrote:

This idea makes much more sense to me than adjusting komi does. 
At least it's an attempt at opponent modeling, which is the actual  
problem that should be addressed. Whether it will actually work  
is something that could be tested.


Another similar idea is not to pass but to play some percentage of  
random moves - which probably would work in programs with strong  
playout strategies.   Of course this would be meaningless for bots  
that have weak (and already random) playout strategies.


- Don





Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Stefan Kaitschick
Modeling the opponent's mistakes is indeed an alternative to introducing komi.
But it would have to be a lot more exact than simply rolling the dice or
skipping a move here and there.
Successful opponent modeling would implement the overplay school of thought:
playing tactically refutable combinations that are beyond the opponent's
skill to punish.
Introducing komi at the 50% win rate level would implement the honte school of
thought: play as if against yourself.
At a win rate of less than 50% it implements the almost-honte school of
thought. :-)
I'm not trying to moralize. In love and go anything is fair.
I'm just saying that while both approaches are legitimate, adjusting the komi
is much easier to do.

Different subject, a suggestion for a komi adjustment scheme:

1. Make a regular evaluation (no extra komi).
2. If the win rate of the best move is within certain bounds, you're done.
(Say between 30 and 70 percent. Just a guess of course. Also, this might
shift as the game progresses.)
3. If not, make a komi adjustment dependent on how far out of bounds the win
rate is.
(No numerical suggestion here. Please experiment.)
4. Make a new search with this komi.
5. If the new result is in bounds, calculate winrate_nokomi * factor +
winrate_komi for each candidate and choose the highest one.
(factor around 10 maybe)
6. If not, go back to 3.


The idea is to choose a move that doesn't contradict the long term goal (the
no-komi search) while trying for a short term goal (the komi search)
if no long term goal is available. (Or if every move satisfies the long term
goal, in the case of taking handicap.)
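
The steps above can be sketched roughly as follows. Everything numerical here
is a placeholder to experiment with: the bounds, the size of the komi step,
the factor of 10, and the cap on rounds. The `search` callable is a
hypothetical interface that takes a komi bonus for our side and returns a win
rate per candidate move; toy_search is a made-up stand-in for it.

```python
import math
from typing import Callable, Dict


def choose_move(
    search: Callable[[float], Dict[str, float]],
    lo: float = 0.30,
    hi: float = 0.70,
    min_step: float = 5.0,
    gain: float = 40.0,
    factor: float = 10.0,
    max_rounds: int = 50,
) -> str:
    """Pick a move using the komi adjustment scheme (steps 1 to 6)."""
    base = search(0.0)  # step 1: regular evaluation, no extra komi
    best = max(base, key=base.get)
    if lo <= base[best] <= hi:  # step 2: already in bounds
        return best

    komi = 0.0
    top = base[best]
    adjusted = base
    for _ in range(max_rounds):  # step 6: repeat until in bounds
        # step 3: adjustment grows with the distance from the bounds
        if top < lo:
            komi += min_step + gain * (lo - top)
        else:
            komi -= min_step + gain * (top - hi)
        adjusted = search(komi)  # step 4: new search with this komi
        top = max(adjusted.values())
        if lo <= top <= hi:
            break
    # step 5: combine both searches (also used if max_rounds runs out)
    return max(base, key=lambda m: base[m] * factor + adjusted[m])


def toy_search(komi: float) -> Dict[str, float]:
    """Hypothetical stand-in for an engine: logistic win rate per move."""
    values = {"a": -60.0, "b": -50.0, "c": -80.0}
    return {m: 1.0 / (1.0 + math.exp(-(v + komi) / 10.0)) for m, v in values.items()}


if __name__ == "__main__":
    print(choose_move(toy_search))
```

The adjustment can overshoot the opposite bound and then step back; the round
cap is only there to guarantee termination in that case.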


Stefan



  - Original Message - 
  From: Don Dailey 
  To: tapani.ra...@tkk.fi ; computer-go 
  Sent: Thursday, August 13, 2009 4:02 PM
  Subject: Re: [computer-go] Dynamic komi at high handicaps


  This idea makes much more sense to me than adjusting komi does.   At least 
it's an attempt at opponent modeling, which is the actual problem that should 
be addressed. Whether it will actually work is something that could be 
tested.

  Another similar idea is not to pass but to play some percentage of random 
moves - which probably would work in programs with strong playout strategies.   
Of course this would be meaningless for bots that have weak (and already 
random) playout strategies.

  - Don






Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Don Dailey
There is one crude way to measure goal compatibility.   See if you can make
the same move work with different komi.   If I'm on the east coast of the
US traveling to the west coast,  I will probably start off on the same road
regardless of whether I'm going to Seattle or San Diego.   If the same road
does not work,  then I'm facing a critical decision point.

So it's probably safe to search for a move that works reasonably well with
different komi.   If you cannot do this, you probably have goals that are
not compatible.   But if you find a move that works well when the score is
50-50 (by manipulating the komi) then you should see if it's compatible with
a tougher goal.   This will at least be some evidence that you are looking
at a common sense move and not a move that commits you to the wrong plan.

But if you have a move that returns a really high score with one komi,  but
raising it up just a bit makes it drop to zero,  you are in trouble with
that move.   Try to find a move that may not be quite as good in the first
case, but is much better in the second case.

Unfortunately, I don't think there is a simple way to implement this.

Has anyone tried scoring where the total area was folded into the main
score,  perhaps as much less significant bits of the score?   This makes
winning big a secondary goal.
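
To make the question concrete, here is one possible form of such a playout
reward. It is a sketch only: the function name playout_reward, the epsilon of
0.01 and the scaling by board area are assumptions. The bonus is kept small
enough that any win still scores higher than any loss.

```python
def playout_reward(margin: float, board_area: int = 361, epsilon: float = 0.01) -> float:
    """Win/loss reward with the score margin folded in as a small bonus.

    `margin` is our final score minus the opponent's, komi included.
    The bonus stays within [-epsilon, epsilon], far below the win/loss gap.
    """
    win = 1.0 if margin > 0 else 0.0
    bonus = epsilon * (margin / board_area)
    return win + bonus


if __name__ == "__main__":
    for m in (-100.5, -0.5, 0.5, 20.5):
        print(m, round(playout_reward(m), 5))
```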


- Don

Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Don Dailey
2009/8/13 Stefan Kaitschick stefan.kaitsch...@hamburg.de

  Modeling the opponents mistakes is indeed an alternative to introducing
 komi.
 But it would have to be a lot more exact than simply rolling the dice or
 skipping a move here and there.
 Successful opponent modeling would implement the overplay school of thought
 - playing tactically refutable
 combinations that are beyond the opponents skill to punish them.


I cannot believe you are being so technically precise about doing this
correctly while advocating something on the other hand which is so obviously
incorrect.

You probably have something here though.   I think the play-out policy is a
more fruitful area to explore than dynamically changing komi.

I would start simple, just trying the simplest approach first then gradually
refining it.   Random occasional pass moves is certainly easy to implement
as a first step.

- Don





Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread terry mcintyre
The dynamic komi is perhaps a misnomer; it's by accident that changing komi 
reflects something which we do want to measure, namely the predicted score. 

An algorithm which does not make use of the predicted score would not make use 
of all available information. 

On a 19x19 board, it is common for some areas to become settled; whether 
unconditionally alive, or ( more likely ) alive under the assumption of 
alternating play. Many moves trade the prospect of territory here versus there. 
Bad moves give up too much for too little. Good moves exploit bad or slack 
moves, and provide an equitable balance against good play.

 Terry McIntyre terrymcint...@yahoo.com


“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop





From: Don Dailey dailey@gmail.com
To: computer-go computer-go@computer-go.org
Sent: Thursday, August 13, 2009 9:27:11 AM
Subject: Re: [computer-go] Dynamic komi at high handicaps




2009/8/13 Stefan Kaitschick stefan.kaitsch...@hamburg.de

Modeling the opponent's mistakes is indeed an alternative to introducing komi.
But it would have to be a lot more exact than simply rolling the dice or
skipping a move here and there.
Successful opponent modeling would implement the overplay school of thought:
playing tactically refutable combinations that are beyond the opponent's skill
to punish.

I cannot believe you are being so technically precise about doing this 
correctly while advocating something on the other hand which is so obviously 
incorrect.

You probably have something here, though. I think the play-out policy is a
more fruitful area to explore than dynamically changing komi.

I would start simple, trying the simplest approach first and then gradually
refining it. Random occasional pass moves are certainly easy to implement as a
first step.

- Don


 
Introducing komi at the 50% win rate level would implement the honte school of
thought: play as if against yourself.
At a win rate of less than 50% it implements the almost-honte school of
thought. :-)
I'm not trying to moralize. In love and go anything is fair.
I'm just saying that while both approaches are legitimate, adjusting the komi
is much easier to do.

Different subject, suggestion for a komi adjustment scheme:

1. Make a regular evaluation (no extra komi).
2. If the win rate of the best move is within certain bounds, you're done.
   (Say between 30 and 70 percent. Just a guess, of course. Also, this might
   shift as the game progresses.)
3. If not, make a komi adjustment dependent on how far out of bounds the win
   rate is.
   (No numerical suggestion here. Please experiment.)
4. Make a new search with this komi.
5. If the new result is in bounds, calculate winrate_nokomi * factor +
   winrate_komi for each candidate and choose the highest one.
   (factor around 10, maybe)
6. If not, go back to 3.

The idea is to choose a move that doesn't contradict the long-term goal
(no-komi search) while trying for a short-term goal (komi search) if no
long-term goal is available. (Or if every move satisfies the long-term goal,
in case of taking handicap.)
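
The six steps above can be sketched as follows. This is only a sketch under stated assumptions: `search` is a hypothetical callable that runs the bot's normal search with the given extra komi (positive favours the bot) and returns a win rate per candidate move; `gain`, the minimum step of 0.5 points and `max_rounds` are invented placeholders for the "please experiment" parts; and using the last komi search when the bounds are never reached is a choice made here, not part of the proposal.

```python
def choose_move(search, lo=0.30, hi=0.70, factor=10.0, gain=40.0, max_rounds=8):
    """search(komi) -> {move: win rate}; positive komi favours the bot."""
    base = search(0.0)                          # step 1: no extra komi
    best = max(base.values())
    if lo <= best <= hi:                        # step 2: already in bounds
        return max(base, key=base.get)

    komi, current, adjusted = 0.0, best, base
    for _ in range(max_rounds):                 # step 6: repeat from step 3
        if current > hi:                        # step 3: too far ahead
            komi -= max(0.5, gain * (current - hi))
        else:                                   # step 3: too far behind
            komi += max(0.5, gain * (lo - current))
        adjusted = search(komi)                 # step 4: search with komi
        current = max(adjusted.values())
        if lo <= current <= hi:
            break

    # step 5: weight the no-komi result by `factor`, add the komi result
    return max(base, key=lambda m: factor * base[m] + adjusted.get(m, 0.0))
```

With `factor` around 10 the no-komi win rate dominates, so the komi search mainly breaks ties between moves that look equally good (or equally hopeless) without komi.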
  
Stefan
 
 
 
Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread Don Dailey
2009/8/13 terry mcintyre terrymcint...@yahoo.com

 The dynamic komi is perhaps a misnomer; it's by accident that changing
 komi reflects something which we do want to measure, namely the predicted
 score.

 An algorithm which does not make use of the predicted score would not make
 use of all available information.


You imply that it's some kind of travesty not to make use of available
information, but the information is only available because you did extra
work to generate it. And even if the information was free,  it matters
that you use it correctly.  Just implying that it's wrong not to do this
because you are throwing away available information is not going to cut
it.






 On a 19x19 board, it is common for some areas to become settled; whether
 unconditionally alive, or ( more likely ) alive under the assumption of
 alternating play. Many moves trade the prospect of territory here versus
 there. Bad moves give up too much for too little. Good moves exploit bad or
 slack moves, and provide an equitable balance against good play.


I think you have described very well why dynamic komi is so hard. You take
ALL these factors and ignore them all except for a single number. You
don't know anything about the composition of that number. If your concern
is about throwing away information, then dynamic komi should be a big
problem for you.



- Don







Re: [computer-go] Dynamic komi at high handicaps

2009-08-13 Thread terry mcintyre
I have never heard a pro say "I estimate my chances of winning this game to be
50.3%", but you will hear "black is ahead by 3 points" or "white wins by 1/2
point" -- they'll make this evaluation based on the alternation of equally
competent play.


 Terry McIntyre terrymcint...@yahoo.com


“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop





Re: [computer-go] Monte-Carlo Simulation Balancing

2009-08-13 Thread Olivier Teytaud
 . Pebbles has a Mogo playout design, where you check
 for patterns only around the last move (or two).


In MoGo, it's not only around the last move (at least with some probability
and when there are empty spaces in the board); this is the fill board
modification.

(this provides a big improvement in 19x19 with big numbers of simulations,
see http://www.lri.fr/~rimmel/publi/EK_explo.pdf , Fig 3 page 8 - not only
quantitative improvement, but also, according to players, a qualitative
improvement in the way mogo plays)
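
A rough sketch of what a fill-board step could look like in a playout, offered as an assumption about the general shape of the technique rather than as MoGo's actual code: before the usual pattern-based move, try a few random empty points and play one if its whole 8-neighbourhood is empty. The function name, the number of tries and the treatment of board edges are guesses made for illustration.

```python
import random


def fill_board_move(stones, size, rng, tries=4):
    """Return an empty point whose 8 neighbours are all empty, or None.

    stones: set of occupied (row, col) points on a size x size board.
    """
    for _ in range(tries):
        r, c = rng.randrange(size), rng.randrange(size)
        if (r, c) in stones:
            continue
        neighbours = [(r + dr, c + dc)
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                      if (dr, dc) != (0, 0)]
        # Off-board neighbours are never in `stones`, so they count as empty.
        if all(p not in stones for p in neighbours):
            return (r, c)
    return None  # caller falls back to the usual pattern-based policy
```

On a crowded board the function almost always returns `None`, so the step only matters while there are still large empty areas.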

Best regards,
Olivier
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

[computer-go] Monte-Carlo Simulation Balancing

2009-08-13 Thread Brian Sheppard
 . Pebbles has a Mogo playout design, where you check
 for patterns only around the last move (or two).


In MoGo, it's not only around the last move (at least with some probability
and when there are empty spaces in the board); this is the fill board
modification.

Just to clarify: I was not saying that Mogo's policy consisted
*solely* of looking for patterns around the last move. Merely that
it does not look for patterns around *every* point, which other
playout policies (e.g., CrazyStone, if I understand Remi's papers
correctly) appear to do. The RL paper seems to require that
playout design.

If you want to prioritize patterns around every point then you
need a more sophisticated board representation than Pebbles uses.

BTW, FillBoard seems to help Pebbles, too. A few percent better
on 9x9 games. No testing on larger boards. YMMV, and like everything
about computer go: all predictions are guaranteed to be wrong,
or your money back.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Monte-Carlo Simulation Balancing

2009-08-13 Thread Olivier Teytaud


 Just to clarify: I was not saying that Mogo's policy consisted
 *solely* of looking for patterns around the last move. Merely that
 it does not look for patterns around *every* point, which other
 playout policies (e.g., CrazyStone, if I understand Remi's papers
 correctly) appear to do. The RL paper seems to require that
 playout design.


Fine!



 BTW, FillBoard seems to help Pebbles, too. A few percent better
 on 9x9 games. No testing on larger boards. YMMV, and like everything
 about computer go: all predictions are guaranteed to be wrong,
 or your money back.


For us the improvement is essential in 19x19 - I'll find that for the
generality of fillboard if it helps also for you :-) The loss of diversity due
to patterns is really clear in some situations, so the problem solved by
fillboard is understandable, so I believe it should work also for you - but, as
you say, all predictions in computer-go are almost guaranteed to be wrong :-)
Olivier
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

[computer-go] Heavier playouts

2009-08-13 Thread David Fotland
A couple of weeks ago I made the playouts slightly heavier by adding a few
2-liberty local rules.  It made a big difference in the program's strength
(from strong 3 kyu to weak 1 kyu).

www.gokgs.com/servlet/graph/ManyFaces-en_US.png

David


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Heavier playouts

2009-08-13 Thread Robert Jasiek

David Fotland wrote:

made the playouts slightly heavier by adding a few
2-liberty local rules.


What does "heavier" mean here and could you please give an example of
such a rule? Do you have an understanding of why they make your program
stronger?


--
robert jasiek
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


RE: [computer-go] Heavier playouts

2009-08-13 Thread David Fotland
Heavier means more analysis in the playouts about what move to make - less
pure random.  I don't understand why it's stronger, but I'm happy with the
result.  Playouts are pretty much a matter of trying something and testing it.
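
To make "heavier" concrete, here is a minimal sketch of a playout policy that consults a list of local rules before falling back to a uniformly random move. The rule interface (a function from the last move to suggested replies) and all names are hypothetical; the actual 2-liberty rules are not described in this thread and are not reproduced here.

```python
import random


def heavy_playout_move(legal_moves, last_move, rules, rng):
    """Try each local rule in order; fall back to a uniformly random move."""
    for rule in rules:
        candidates = [m for m in rule(last_move) if m in legal_moves]
        if candidates:
            return rng.choice(candidates)
    return rng.choice(legal_moves)
```

With an empty `rules` list this is a pure random ("light") playout; each added rule makes the playout heavier.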

David


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[computer-go] Re: Heavier playouts

2009-08-13 Thread Hideki Kato

David Fotland: 091c01ca1c4f$9dea69e0$d9bf3d...@com:
A couple of weeks ago I made the playouts slightly heavier by adding a few
2-liberty local rules.  It made a big difference in the program's strength
(from strong 3 kyu to weak 1 kyu).

www.gokgs.com/servlet/graph/ManyFaces-en_US.png

Is this URL correct?  I can't see that picture (IE8.0/WindowsXP).  Is 
that the rank graph of MFG?

Hideki
--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


RE: [computer-go] Re: Heavier playouts

2009-08-13 Thread David Fotland
Works for me.  It's the rank graph.  You can also get it on KGS, user info
for ManyFaces


___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/