date:20090812

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Stefan Kaitschick



Maybe I should ask first, for clarity's sake, is MCTS performance in
handicap games currently a problem?

Mark



Yes, it's a big problem. And that's not a matter of opinion.
MC bots, leading a game by a large margin, will give away their advantage 
lightly except for the last half point.
Even on a 9*9 board, even if the bot wins more games on even with 7.5 komi, 
that doesn't mean that it's impossible
for the human to win, giving a 2 stone handicap. All it needs is a single 
bot misjudgement after the game got close.
Granted, bots are really excellent at defending the last half point 
advantage tooth and claw. I'm just saying that it should be impossible 
for the human to win on 2 stones, and it isn't.
If they are behind by a large margin they will play either random or ko 
threat type moves.
So there is a kind of symmetry here. Being too far ahead or behind ruins 
the bot's play.
The biggest practical problem right now is poor play against pros on a 19*19 
board, taking a large handicap.
Special fuseki patterns are only a patch. When, after a decent opening, the 
regular patterns take over, they usually immediately start to work against 
the bot's own previous moves.
Looking into the horse's mouth, instead of invoking Aristotle, is really the 
only way to find out.
I had hoped that programmers would find the idea interesting enough to try 
it out.
Instead, I found myself in a hand waving contest. Granted, I started it, so 
I can't complain.
Thanks to Ingo for simulating dynamic komi by hand to give programmers 
something less speculative.
Btw, I played 2 games (as gogonuts) on KGS against goIngo (really ManyFaces). 
I won both on 5 stones. But in the first one, with komi adjusted by Ingo, I 
had to make a very critical invasion that should not really have worked. In 
the second game I won without problems.

At the time, Ingo adjusted the komi so that the win rate for W was 50%.
Since then, with his limited trials, Ingo found out that adjusting the komi 
to give each side a 50% win rate isn't optimal. His current rule is to 
adjust to 42% for W. This is of course only a crude start, but sophistication 
can only be introduced by programmers.
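
The rule above (choose the komi at which W's estimated win rate is about 42%) could be automated as a search over komi. What follows is only a minimal sketch, not Ingo's actual procedure, which was done by hand. The names komi_for_target_winrate and white_winrate and the logistic toy model are made up for illustration, and the sketch assumes that W's estimated win rate never decreases as komi grows.

```python
import math


def komi_for_target_winrate(white_winrate, target=0.42, lo=-150, hi=150):
    """Return the smallest half-point komi whose estimated win rate for W
    reaches `target`.

    `white_winrate(komi)` is a hypothetical callback (for example a batch
    of playouts run with that komi) and is assumed to be non-decreasing in
    komi. The komi values searched are n + 0.5 for integers lo <= n <= hi.
    If even the largest komi falls short, that largest komi is returned.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if white_winrate(mid + 0.5) >= target:
            hi = mid          # target reached: try a smaller komi
        else:
            lo = mid + 1      # still too pessimistic for W: raise komi
    return lo + 0.5


def toy_white_winrate(komi, deficit=60.0, spread=10.0):
    """Toy stand-in for playouts: W is `deficit` points behind on the board."""
    return 1.0 / (1.0 + math.exp(-(komi - deficit) / spread))


if __name__ == "__main__":
    print(komi_for_target_winrate(toy_white_winrate))
```

In a real bot the callback would be noisy, so the monotonicity assumption only holds approximately and the search would need enough playouts per probe to be stable.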


Stefan




___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Christoph Birk



On Aug 12, 2009, at 10:31 PM, Petri Pitkanen wrote:

Maybe they are a long way from giving handicaps to you. But the best bots
on KGS are around 2k and there are hundreds of 9k and weaker players
present there at all times. So being able to play white is a worthy
thing, at least for a commercial bot.


That's correct. I have a more "academic" point of view.

Christoph


Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Christoph Birk



On Aug 12, 2009, at 3:43 PM, Don Dailey wrote:
I believe the only thing wrong with the current MCTS strategy is
that you cannot get a statistically meaningful number of samples when
almost all games are won or lost. You can get a more meaningful
NUMBER of samples by adjusting komi, but unfortunately you are
sampling the wrong thing - an approximation of the actual goal.
Since the approximation may be wrong or right, your algorithm is
not scalable. You could run on a billion processors sampling
billions of nodes per second and with no flaw in the search or the
playouts still play a move that gives you no chance of winning.


I think you got it the wrong way round.
Without dynamic komi (in high handicap games) even trillions of simulations
will _not_ find a move that creates a winning line, because there is none
if the opponent has the same strength as you.
WHITE has to assume that BLACK will make mistakes, otherwise there
would be no handicap.

Christoph

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Petri Pitkanen

Maybe they are a long way from giving handicaps to you. But the best bots
on KGS are around 2k and there are hundreds of 9k and weaker players
present there at all times. So being able to play white is a worthy
thing, at least for a commercial bot.

Petri

2009/8/13 Christoph Birk :
>
> On Aug 12, 2009, at 2:51 PM, Don Dailey wrote:
>>
>> I disagree.   I think strong players have a sense of what kind of mistakes
>> to expect, and try to provoke those mistakes.   Dynamic komi does not model
>> that.
>>
>> It also does the opposite of making the program play provocatively, which
>> I believe is necessary to beat a weaker player with a large handicap against
>> you.    Instead of making it fight,  it encourages the program to be content
>> with less.   How does this model strong handicap players?
>
> Maybe dynamic komi works better for BLACK? Computers are still
> a looong way from actually _giving_ a handicap.
>
> Christoph



-- 
Petri Pitkänen
e-mail: petri.t.pitka...@gmail.com

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Christoph Birk



On Aug 12, 2009, at 3:10 PM, Don Dailey wrote:
If the handicap is fair,  their chance is about 50/50.   However,   
rigging komi to give the same chance is NOT what humans do.   The  
only thing you said that I consider correct is that humans estimate  
their chances to be about 50/50.
One thing humans do is to set short term goals and I think dynamic  
komi is an attempt to do that - but it's a misguided attempt  
because you are setting the WRONG short term goal.


Setting the komi so that the game is 50/50 creates the (correct)
short term goal of gaining a few points, then again, and again ...

Christoph
 

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Christoph Birk



On Aug 12, 2009, at 2:51 PM, Don Dailey wrote:
I disagree.   I think strong players have a sense of what kind of  
mistakes to expect, and try to provoke those mistakes.   Dynamic  
komi does not model that.


It also does the opposite of making the program play provocatively,
which I believe is necessary to beat a weaker player with a large
handicap against you. Instead of making it fight, it encourages
the program to be content with less. How does this model strong
handicap players?


Maybe dynamic komi works better for BLACK? Computers are still
a looong way from actually _giving_ a handicap.

Christoph


Re: [computer-go] Monte-Carlo Simulation Balancing

2009-08-12 Thread Michael Williams


After about the 5th reading, I'm concluding that this is an excellent paper.  
Is anyone (besides the authors) doing research based on this?  There is a lot 
to do.


David Silver wrote:

Hi everyone,

Please find attached my ICML paper with Gerry Tesauro on automatically 
learning a simulation policy for Monte-Carlo Go. Our preliminary results 
show a 200+ Elo improvement over previous approaches, although our 
experiments were restricted to simple Monte-Carlo search with no tree on 
small boards.





Abstract

In this paper we introduce the first algorithms for efficiently learning 
a simulation policy for Monte-Carlo search. Our main idea is to optimise 
the balance of a simulation policy, so that an accurate spread of 
simulation outcomes is maintained, rather than optimising the direct 
strength of the simulation policy. We develop two algorithms for 
balancing a simulation policy by gradient descent. The first algorithm 
optimises the balance of complete simulations, using a policy gradient 
algorithm; whereas the second algorithm optimises the balance over every 
two steps of simulation. We compare our algorithms to reinforcement 
learning and supervised learning algorithms for maximising the strength 
of the simulation policy. We test each algorithm in the domain of 5x5 
and 6x6 Computer Go, using a softmax policy that is parameterised by 
weights for a hundred simple patterns. When used in a simple Monte-Carlo 
search, the policies learnt by simulation balancing achieved 
significantly better performance, with half the mean squared error of a 
uniform random policy, and equal overall performance to a sophisticated 
Go engine.


-Dave





[computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Brian Sheppard

No thought experiments are going to convince me on this subject.
Someone will have to do an actual test. Ingo's work is the best
to date on the subject.

Anyone who is overly committed to thought experiments should
consider that we are talking about applying MCTS to Go, that most
deterministic of all games. The whole idea is absurd from a logical
perspective. Despite logic, some things just seem to work.

Maybe dynamic komi will work. Or maybe we need to maximize point
differential. Or maybe we just need to get stronger. Only actual
experiments and testing will tell.



Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Mark Boon

2009/8/12 Don Dailey :
>
> If the program makes decisions about the best way to win N points,   there
> is no guarantee that this is ALSO the best way to win N+1 points.

Although this is obviously true, that doesn't automatically mean it's
not the best approach. Because there's a hidden assumption in there.
And that is it's not the best way to win by N+1, given proper play by
the opponent thereafter. If not perfect, then at least as strong as
the stronger player.

Whatever your strategy, even when you catch up a lot there's no
guarantee the opponent will keep making mistakes enough for you to
win. Human players generally do keep track whether they seem to be
catching up 'enough' and will take more risk when progress is not in
line with the progress of the game.

I don't think anyone is trying to argue that adjusting komi is the
perfect answer. But what apparently is observed (I never tried myself)
is that currently MCTS does poorly in handicap games. So the question
is whether adjusting the handicap would improve performance.

The positions seem to be entrenched. But I have yet to see conclusive
evidence or persuasive arguments one way or the other.

Maybe I should ask first, for clarity's sake, is MCTS performance in
handicap games currently a problem?

Mark

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Matthew Woodcraft

Don Dailey wrote:
> Matthew Woodcraft wrote:
> > Don Dailey wrote:
> > > The problem with MCTS programs is that they like to consolidate. You
> > > set the komi and thereby give them a goal and they very quickly make
> > > moves which commit to that specific goal.
> >
> > How did you form this opinion? Can you show an example game record
> > (on 19x19) showing this behaviour?

> You're kidding, right? Does anyone honestly dispute this?

I believe it to be false, yes.

There are plenty of records of 19x19 MCTS computer games available. I
haven't seen one in which the computer very quickly committed to
anything, and I don't believe you have either.

I suggest your view of computer go may be distorted by too much
concentration on 9x9.

-M-


Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Stefan Kaitschick

"For instance I am sure he will not sit merrily by and watch his opponent 
consolidate a won game just so that he can have a "respectable" but losing 
score. Dynamic komi of course does not address that at all."

This seems self-evident, but it may actually be a treacherous conclusion.

Dynamic komi really only has one legitimate use: to avoid flat-lining or, 
when taking a handicap, saturated win rates.
This doesn't mean that the program needs to be "satisfied" with losing. (A komi 
that put the win rate at 50% would model that.)
A win rate range between 35 and 45 percent might make the program locally 
ambitious, without attempting the impossible with ko threat
type moves.
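
One way to read the 35 to 45 percent range is as a dead band: leave the komi alone while the estimated win rate is inside the band, and nudge it only when the estimate leaves the band. The sketch below is an assumption about how that might look, not a tested method. The function name adjust_fake_komi, the one-point step, and the convention that positive komi means points credited to the bot are all invented for illustration.

```python
def adjust_fake_komi(komi, winrate, low=0.35, high=0.45, step=1.0):
    """Nudge the fake komi so the bot's estimated win rate drifts back
    into the band [low, high].

    `komi` is counted here as bonus points credited to the bot itself
    (an assumed convention); `winrate` is the bot's current estimate of
    its own winning chances under that komi.
    """
    if winrate < low:
        return komi + step   # position looks hopeless: credit more points
    if winrate > high:
        return komi - step   # position looks too easy: take points away
    return komi              # inside the band: leave the komi alone


if __name__ == "__main__":
    komi = 40.0
    for estimate in (0.20, 0.30, 0.40, 0.50):
        komi = adjust_fake_komi(komi, estimate)
        print(estimate, komi)
```

Keeping the band below 50% is what is meant to leave the program "locally ambitious": it always sees itself as slightly behind rather than level.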

Stefan 

  - Original Message - 
  From: Don Dailey 
  To: computer-go 
  Sent: Thursday, August 13, 2009 12:10 AM
  Subject: Re: [computer-go] Dynamic komi at high handicaps

  On Wed, Aug 12, 2009 at 5:58 PM, Mark Boon  wrote:

2009/8/12 Don Dailey :

>
> I disagree about this being what humans do.   They do not set a fake komi
> and then try to win only by that much.

I didn't say that humans do that. I said they consider their chance
50-50. For an MC program to consider its chances to be 50-50 you'd
have to up the komi. There's a difference.

  If the handicap is fair,  their chance is about 50/50.   However,  rigging 
komi to give the same chance is NOT what humans do.   The only thing you said 
that I consider correct is that humans estimate their chances to be about 
50/50.

  One thing humans do is to set short term goals and I think dynamic komi is an 
attempt to do that - but it's a misguided attempt because you are setting the 
WRONG short term goal. The human will have a much more specific goal that 
is going to be compatible with his hope of winning the game. For instance I 
am sure he will not sit merrily by and watch his opponent consolidate a won 
game just so that he can have a "respectable" but losing score. Dynamic komi 
of course does not address that at all.

>
> I think their model is somewhat incremental, trying to win a bit at a time
> but I'm quite convinced that they won't just let the opponent consolidate
> like MCTS does.   With dynamic komi the program will STILL just try to
> consolidate and not care about what his opponent does.   But strong players
> will know that letting your opponent consolidate is not going to work. So
> they will keep things complicated and challenge their weaker opponents
> everywhere that is important.
>

It's difficult to make hard claims about this. I don't agree at all
that the stronger player constantly needs to keep things complicated.
Personally I tend to play solidly when giving a handicap. Because most
damage is self-inflicted. You can either make a guess what the weaker
player doesn't know, or you can give him the initiative and he'll show
you. I prefer the latter approach.

When done properly, I don't see how an MCTS program would consolidate
all the time. Doing so would keep the position stable while the komi
declines. As soon as he gets behind the komi degradation curve play
will automatically get more dynamic in an attempt to catch up.

The problem is: we're speculating. The proof is in the pudding.

  Agreed. 

  - Don

Mark

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread terry mcintyre

As for how to beat weaker players ... the strong players whom I have observed 
make strong, stable positions; they wait for the weaker player to make 
mistakes. The stronger player will leave things unresolved for longer, knowing 
that there will be time to extend in one direction or another later in the 
game. 

They'll include some of the more interesting, obscure joseki, which the weaker 
player will not grok appropriately.

I have seen players who make unsound "trick plays", but these are the kyu 
players; dan-level players know that "trick plays" can become costly mistakes. 
They'll use them for teaching purposes, not to win - but that's a different 
kind of game entirely.



Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Stefan Kaitschick


"What seems difficult to me however is to devise a reasonable way to
decrease this komi as the game progresses"


Certainly that is the main problem. But the main considerations are not so 
hard to find:


1. Win rate of the best move.
2. How far the game has progressed.
3. Deviation between the win rates of all possible moves (with higher 
deviation, dynamic komi is less called for).
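
To make the three considerations concrete, here is a purely hypothetical way of combining them into a single weight between 0 and 1 that says how strongly dynamic komi should be applied. Nothing in it comes from a tested program: the function name dynamic_komi_weight, the thresholds saturation and spread_cap, and the choice to multiply the three factors are all assumptions for illustration.

```python
def dynamic_komi_weight(best_winrate, progress, spread,
                        saturation=0.10, spread_cap=0.05):
    """Return a weight in [0, 1] for how much dynamic komi to apply.

    best_winrate: win rate of the best move (consideration 1).
    progress:     fraction of the game already played, 0..1 (consideration 2).
    spread:       deviation between the win rates of the candidate moves
                  (consideration 3).
    `saturation` and `spread_cap` are made-up tuning constants.
    """
    # 1. Only act when the best win rate is close to 0 or 1.
    distance = min(best_winrate, 1.0 - best_winrate)
    saturated = max(0.0, 1.0 - distance / saturation)
    # 2. Fade the effect out as the game approaches its end.
    remaining = max(0.0, 1.0 - progress)
    # 3. If the moves are already well separated, komi is less called for.
    flat = max(0.0, 1.0 - spread / spread_cap)
    return saturated * remaining * flat


if __name__ == "__main__":
    print(dynamic_komi_weight(0.02, 0.0, 0.0))   # saturated, early, flat
    print(dynamic_komi_weight(0.50, 0.0, 0.0))   # balanced game: no komi
```

The weight could then scale whatever komi offset the program would otherwise apply; whether a product is the right way to combine the factors is exactly the kind of question that needs experiments.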


Stefan

- Original Message - 
From: "Mark Boon" 

To: "computer-go" 
Sent: Wednesday, August 12, 2009 11:36 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps



I started to write something on this subject a while ago but it got
caught up in other things I had to do.

When humans play a (high) handicap game, they don't estimate a high
winning percentage for the weaker player. They'll consider it to be
more or less 50-50. So to adjust the komi at the beginning of the game
such that the winning percentage becomes 50% seems a very reasonable
idea to me. This is what humans do too, they'll assume the stronger
player will be able to catch up a certain number of points to overcome
the handicap.

What seems difficult to me however is to devise a reasonable way to
decrease this komi as the game progresses. In an actual game the
stronger player catches up in leaps and bounds, not smoothly.

In MC things are not always intuitive though.

Mark

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

2009/8/12 Stefan Kaitschick 

>  What a bot does with its playouts in a handicap situation is to
> essentially try to beat itself, despite the handicap.
>
> And in this situation the bot reacts in a very human way, it becomes
> despondent.
>
> Adjusting the komi dynamically shifts the goal from winning to catching up
> quickly enough.
>

I think that is the problem though. You have only 1 thing you can
control, how to set the komi before doing the search. But how the
program deals with your artificial (and crude) setting is unpredictable.

What you really need is some kind of way to tell it to try to win some
territory, but not spoil your chances of winning a bit more later. It's
easier to win N points if you know in advance that you will not be asked
later to win N more points. And I'm afraid that is what will happen all
too often - the program will maximize its chances of winning N, but this
does not always translate directly into winning N plus MORE.



>
> I feel that that is the natural handicap strategy, not a band-aid.
>

It's a scalability issue, which is why I call it a band-aid. It's not
natural because it's an artificial goal, not a natural one and certainly not
the ACTUAL goal, which is to win the game. Do you want to win N points, or
do you want to win the game? And we all KNOW that it will try to maximize
its chances of winning N points, regardless of the consequences beyond
that.

You would never ask a runner to stop 50 feet short of the finish line, then
ask him to go 10 feet more, and so on. The runner plans his strategy
based on the actual distance run and anything else would change his pacing
strategy in a bad way.

If the program makes decisions about the best way to win N points,   there
is no guarantee that this is ALSO the best way to win N+1 points.  This
is the implicit assumption in this strategy,  that the best way to win with
ANY komi is the same and that the same moves are just as good no matter
what.   In fact the more you must win by, the more chances you must take.

I believe the only thing wrong with the current MCTS strategy is that you
cannot get a statistically meaningful number of samples when almost all games
are won or lost. You can get a more meaningful NUMBER of samples by adjusting
komi, but unfortunately you are sampling the wrong thing - an approximation
of the actual goal.
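
The sampling problem can be put in rough numbers. The sketch below treats each playout as an independent win/loss trial, which real MCTS playouts are not, and asks how many playouts per move are needed before the gap between two moves exceeds z standard errors. The function name playouts_needed and the example win rates are invented for illustration, not measurements from any program.

```python
import math


def playouts_needed(p1, p2, z=2.0):
    """Approximate playouts per move needed before the gap between two
    win rates p1 and p2 exceeds z standard errors of their difference.

    With n independent playouts per move, the standard error of the
    difference is sqrt((p1*(1-p1) + p2*(1-p2)) / n); solving
    |p1 - p2| >= z * SE for n gives the expression below.
    """
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil(z * z * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Balanced position: moves at 50% and 55% separate after roughly 800 playouts.
    print(playouts_needed(0.50, 0.55))
    # Saturated position: moves at 98% and 98.5% need roughly 5,500 playouts.
    print(playouts_needed(0.98, 0.985))
```

Under these assumptions, the closer all candidate moves sit to 0% or 100%, the smaller their absolute differences tend to be, and the more playouts it takes to tell them apart.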

Since the approximation may be wrong or right, your algorithm is not
scalable. You could run on a billion processors sampling billions of nodes
per second and with no flaw in the search or the playouts still play a move
that gives you no chance of winning.

- Don





> Of course, the dynamic komi must be adjusted down to zero in good time.
>
> I think there are 2 main reasons that this hasn't been fully explored so far.
>
> 1. Trying to maximize the score turned out to be a huge mistake, compared
> to trying to maximize the winrate.
> This makes dynamic komi a kind of blind spot.
>
> 2. Handicap go wasn't given special attention so far.
>
>
> Stefan
>
>
> - Original Message -
> *From:* Don Dailey 
> *To:* computer-go 
> *Sent:* Wednesday, August 12, 2009 11:24 PM
> *Subject:* Re: [computer-go] Dynamic komi at high handicaps
>
> Terry,
>
> I understand the reasoning behind this, your thought experiment did not add
> anything to my understanding. And I agree that if the program is strong
> enough and the handicap is high enough this is probably better than doing
> nothing at all.
>
> However, I think there must be something that is more along the lines of
> treating the disease, not the symptoms. You might be able to put a band
> aid on the problem but you have not addressed the real issue in a systematic
> way.
>
> Besides, I have not yet seen anyone demonstrate that this works - it's
> always talked about but never implemented. It is made to sound so simple
> that you have to wonder where the implementation is and why the strong
> programs do not have it.
>
> - Don
>
>
>
>
> 2009/8/12 terry mcintyre 
>
>>  Consider this thought experiment.
>>
>> You sit down at a board and your opponent has a 9-stone handicap.
>>
>> By any objective measure of the game, you should resign immediately.
>>
>> All your win-rate calculations report this hopeless state of affairs.
>>
>> Winrate gives you no objective basis to prefer one move or another.
>>
>> But, you think, what if I can make a small group? What if I try for a
>> lesser goal, such as "don't lose by more than 90 points?"
>>
>> Your opponent has a 9 stone handicap because he makes more mistakes than
>> you do.
>>
>> As the game progresses, those mistakes add up. You set your goal higher -
>> losing by only 50 points; losing by only 10 points.
>>
>> The changing goal permits you to discriminate in a field which would
>> otherwise look like a dark, desolate, win-less landscape.
>>
>> Terry McIntyre 
>>
>>  “We hang the petty thieves and appoint the great ones to public office.”
>> -- Aesop
>>
>>

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Stefan Kaitschick

"You are giving the program an arbitrary short term goal which may,  or may not 
be compatible with the long term goal of winning the game."

Don,

this is a very important consideration. How can an illusory goal be better 
than the real goal?
But I would argue that in the handicap situation, catching up quickly enough is 
actually the real goal.
You write:
"And as the base program gets stronger this aspect of the program becomes more 
and more of a wart."

This I disagree with. Because no matter how strong the program becomes, it 
will never find a way to defeat itself against a large handicap.
This is effectively what a program tries to do with its playouts.
The only reasonable alternative to trying to catch up quickly enough is to 
model the weaker player's errors straight into the playouts, and try to find a 
direct win. But this seems more speculative to me than dynamic komi. Surely, it 
is also harder to implement well.

Stefan

  - Original Message - 
  From: Don Dailey 
  To: computer-go 
  Sent: Wednesday, August 12, 2009 11:11 PM
  Subject: Re: [computer-go] Dynamic komi at high handicaps


  The problem with MCTS programs is that they like to consolidate. You set 
the komi and thereby give them a goal and they very quickly make moves which 
commit to that specific goal. Committing to less than you need to actually win 
will often involve sacrificing chances to win. Sometimes it won't, but you 
cannot have a scalable algorithm which is this arbitrary.

  However, if the handicap is too high, the program thinks every line is a loss 
and it plays randomly.   That's why we even consider doing this.

  Dynamically changing komi could be of some benefit in that situation if there 
is no alternative reasonable strategy, but it does not address the real 
problem - which is what I call the "committal consolidation" problem. You 
are giving the program an arbitrary short term goal which may, or may not be 
compatible with the long term goal of winning the game. Whether it's 
compatible or not is based on your own credulity - not anything predictable or 
that you can scale. And as the base program gets stronger this aspect of the 
program becomes more and more of a wart.

  If this can be made to work in the short term,  it should be considered a 
temporary hack which should be fixed as soon as possible.   

  We have to think about this anyway sooner or later because if programs 
continue to develop and the predictive ability of the playouts and tree search 
gets several hundred ELO better,  these programs may start to see more and more 
positions as either dead won or dead lost.  I'm sure we will want some kind 
of robust mechanism for dealing with this which is better at estimating chances 
that the opponent will go wrong  as opposed to doing something that is a random 
benefit or hindrance. 

  - Don


   





  2009/8/12 terry mcintyre 

Ingo suggested something interesting - instead of changing the komi 
according to the move number, or some other fixed schedule, it varies according 
to the estimated winrate. 

It also, implicitly, depends on one's guess of the ability of the opponent. 

An interesting test would be to take an opponent known to be weaker, offer 
it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what 
handicap does the ratio balance at 50:50? Can the number of handicap stones be 
increased with such an adaptive algorithm?

Even better, play against a stronger opponent; can one increase the win 
rate versus strong opponents?

The usual range of computer opponents is fairly narrow. None approach 
high-dan levels on 19x19 boards - yet.


Terry McIntyre 


“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop




From: Brian Sheppard 
To: computer-go@computer-go.org
Sent: Wednesday, August 12, 2009 12:33:13 PM
Subject: [computer-go] Dynamic komi at high handicaps


>The small samples is probably the least of the problems with this.  Do you
>actually believe that you can play games against it and not be subjective in
>your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves
between two computer opponents.

The result does support Ingo's belief that dynamic Komi will help programs
play high handicap games. Due to small sample size it isn't very strong
evidence. But maybe it is enough to induce a programmer who actually plays
in such games to create a more exhaustive test.


Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

On Wed, Aug 12, 2009 at 6:03 PM, Matthew Woodcraft
wrote:

> Don Dailey wrote:
> > The problem with MCTS programs is that they like to consolidate. You
> > set the komi and thereby give them a goal and they very quickly make
> > moves which commit to that specific goal.
>
> How did you form this opinion? Can you show an example game record
> (on 19x19) showing this behaviour?

You're kidding, right? Does anyone honestly dispute this?

I'm certainly not going to entertain this with examples - if you don't
understand this I'm sure we would waste a dozen emails arguing about it
regardless of what I could show you.

- Don

>
> -M-

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Stefan Kaitschick

What a bot does with its playouts in a handicap situation is to essentially try 
to beat itself, despite the handicap.

And in this situation the bot reacts in a very human way: it becomes despondent.

Adjusting the komi dynamically shifts the goal from winning to catching up 
quickly enough.

I feel that that is the natural handicap strategy, not a band-aid.

Of course, the dynamic komi must be adjusted down to zero in good time.
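
The simplest schedule that reaches zero "in good time" is a straight line from the starting komi down to zero at a chosen move. This is only a sketch under assumed names: scheduled_komi and zero_by_move are hypothetical, and the linear shape is an arbitrary choice, since nothing in this thread establishes which decay curve works best.

```python
def scheduled_komi(initial_komi, move_number, zero_by_move):
    """Linearly shrink the fake komi from `initial_komi` at move 0 to
    exactly 0 at `zero_by_move`, and keep it at 0 afterwards.

    `zero_by_move` is a made-up tuning parameter: the move number by
    which the bot must be judging the real, unadjusted game.
    """
    if zero_by_move <= 0 or move_number >= zero_by_move:
        return 0.0
    return initial_komi * (1.0 - move_number / zero_by_move)


if __name__ == "__main__":
    for move in (0, 60, 120, 200):
        print(move, scheduled_komi(60.0, move, 120))
```

A fixed schedule like this ignores how the game is actually going, which is why varying the komi according to the estimated win rate has been suggested as an alternative.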

I think there are 2 main reasons that this hasn't been fully explored so far.

1. Trying to maximize the score turned out to be a huge mistake, compared to 
trying to maximize the winrate.
This makes dynamic komi a kind of blind spot.

2. Handicap go wasn't given special attention so far.

Stefan

  - Original Message - 
  From: Don Dailey 
  To: computer-go 
  Sent: Wednesday, August 12, 2009 11:24 PM
  Subject: Re: [computer-go] Dynamic komi at high handicaps

  Terry,

  I understand the reasoning behind this, your thought experiment did not add 
anything to my understanding. And I agree that if the program is strong 
enough and the handicap is high enough this is probably better than doing 
nothing at all.

  However, I think there must be something that is more along the lines of 
treating the disease, not the symptoms. You might be able to put a band aid 
on the problem but you have not addressed the real issue in a systematic way.

  Besides, I have not yet seen anyone demonstrate that this works - it's always 
talked about but never implemented. It is made to sound so simple that you 
have to wonder where the implementation is and why the strong programs do not 
have it.

  - Don

  2009/8/12 terry mcintyre 

Consider this thought experiment.

You sit down at a board and your opponent has a 9-stone handicap.

By any objective measure of the game, you should resign immediately.

All your win-rate calculations report this hopeless state of affairs. 

Winrate gives you no objective basis to prefer one move or another.

But, you think, what if I can make a small group? What if I try for a 
lesser goal, such as "don't lose by more than 90 points?"

Your opponent has a 9 stone handicap because he makes more mistakes than 
you do. 

As the game progresses, those mistakes add up. You set your goal higher - 
losing by only 50 points; losing by only 10 points. 

The changing goal permits you to discriminate in a field which would 
otherwise look like a dark, desolate, win-less landscape. 

Terry McIntyre 

“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop

From: Don Dailey 
To: computer-go 
Sent: Wednesday, August 12, 2009 1:05:36 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

Ok,  I misunderstood his testing procedure.  What he is doing is far more 
scientific than what I thought he was doing.  

There has got to be something better than this.   What we need is a way to 
make the playouts more meaningful but not by artificially reducing our actual 
objective which is to win.

For the high handicap games,  shouldn't the goal be to maximize the score?  
 Instead of adjusting komi why not just change the goal to win as much of the 
board as possible? This would be far more honest and reliable I would think 
and the program would not be forced to constantly waste effort on constantly 
changing goals.

- Don

On Wed, Aug 12, 2009 at 3:33 PM, Brian Sheppard  wrote:

  >The small samples is probably the least of the problems with this.   Do you
  >actually believe that you can play games against it and not be subjective in
  >your observations or how you play against it?

  These are computer-vs-computer games. Ingo is manually transferring moves
  between two computer opponents.

  The result does support Ingo's belief that dynamic Komi will help programs
  play high handicap games. Due to small sample size it isn't very strong
  evidence. But maybe it is enough to induce a programmer who actually plays
  in such games to create a more exhaustive test.

  ___
  computer-go mailing list
  computer-go@computer-go.org
  http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

On Wed, Aug 12, 2009 at 5:58 PM, Mark Boon  wrote:

> 2009/8/12 Don Dailey :
> >
> > I disagree about this being what humans do.   They do not set a fake komi
> > and then try to win only by that much.
>
> I didn't say that humans do that. I said they consider their chance
> 50-50. For an MC program to consider its chances to be 50-50 you'd
> have to up the komi. There's a difference.


If the handicap is fair,  their chance is about 50/50.   However,  rigging
komi to give the same chance is NOT what humans do.   The only thing you
said that I consider correct is that humans estimate their chances to be
about 50/50.

One thing humans do is to set short term goals and I think dynamic komi is
an attempt to do that - but it's a misguided attempt because you are setting
the WRONG short term goal. The human will have a much more specific goal
that is going to be compatible with his hope of winning the game. For
instance I am sure he will not sit merrily by and watch his opponent
consolidate a won game just so that he can have a "respectable" but losing
score. Dynamic komi of course does not address that at all.



> >
> > I think their model is somewhat incremental, trying to win a bit at a
> time
> > but I'm quite convinced that they won't just let the opponent consolidate
> > like MCTS does.   With dynamic komi the program will STILL just try to
> > consolidate and not care about what his opponent does.   But strong
> players
> > will know that letting your opponent consolidate is not going to work.
> So
> > they will keep things complicated and challenge their weaker opponents
> > everywhere that is important.
> >
>
> It's difficult to make hard claims about this. I don't agree at all
> that the stronger player constantly needs to keep things complicated.
> Personally I tend to play solidly when giving a handicap. Because most
> damage is self-inflicted. You can either make a guess what the weaker
> player doesn't know, or you can give him the initiative and he'll show
> you. I prefer the latter approach.
>
> When done properly, I don't see how an MCTS program would consolidate
> all the time. Doing so would keep the position stable while the komi
> declines. As soon as he gets behind the komi degradation curve play
> will automatically get more dynamic in an attempt to catch up.
>
> The problem is: we're speculating. The proof is in the pudding.


Agreed.

- Don



>
>
> Mark

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Matthew Woodcraft

Don Dailey wrote:
> The problem with MCTS programs is that they like to consolidate. You
> set the komi and thereby give them a goal and they very quickly make
> moves which commit to that specific goal.

How did you form this opinion? Can you show an example game record
(on 19x19) showing this behaviour?

-M-

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread terry mcintyre

In practical terms, the problem to solve is the reverse: how do we encourage 
weak programs to hang on to as much of their advantage as possible, against 
stronger players? 

In 2020, we can worry about how to beat pro players who take large handicaps 
against computer programs.

Terry McIntyre 


“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop





From: Don Dailey 
To: computer-go 
Sent: Wednesday, August 12, 2009 2:51:09 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps




2009/8/12 terry mcintyre 

Most experiments are done on even games; this dynamic algorithm applies 
particularly to handicap games. In that context, it is not an ungainly kludge, 
but actually reflects the assessment of evenly matched pro players - they look 
at the board, and see a victory of n times 10 handicap stones ( or something 
roughly comparable ) for black. 
>
> 
>This matters because today's programs are not even close to playing at the pro 
>level; to win respect, they'll have to master handicap games - and to do that, 
>they'll need to do two things. First, they'll need to model the expectation 
>that black with a handicap _should_ win big. Second, they'll need to behave 
>gracefully as that initial advantage is whittled down. 

I disagree.   I think strong players have a sense of what kind of mistakes to 
expect, and try to provoke those mistakes.   Dynamic komi does not model that.

It also does the opposite of making the program play provocatively, which I 
believe is necessary to beat a weaker player with a large handicap against you. 
Instead of making it fight, it encourages the program to be content with 
less.   How does this model strong handicap players?

- Don



 

>
>Existing programs don't do either of those two things well. They're tuned 
>toward
> even-game strategy.
>
>
>Terry McIntyre 
>
>
>“We hang the petty thieves and appoint the great ones to public office.” -- 
>Aesop
>
>
>
>
>

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Mark Boon

2009/8/12 Don Dailey :
>
> I disagree about this being what humans do.   They do not set a fake komi
> and then try to win only by that much.

I didn't say that humans do that. I said they consider their chance
50-50. For an MC program to consider its chances to be 50-50 you'd
have to up the komi. There's a difference.

>
> I think their model is somewhat incremental, trying to win a bit at a time
> but I'm quite convinced that they won't just let the opponent consolidate
> like MCTS does.   With dynamic komi the program will STILL just try to
> consolidate and not care about what his opponent does.   But strong players
> will know that letting your opponent consolidate is not going to work.    So
> they will keep things complicated and challenge their weaker opponents
> everywhere that is important.
>

It's difficult to make hard claims about this. I don't agree at all
that the stronger player constantly needs to keep things complicated.
Personally I tend to play solidly when giving a handicap. Because most
damage is self-inflicted. You can either make a guess what the weaker
player doesn't know, or you can give him the initiative and he'll show
you. I prefer the latter approach.

When done properly, I don't see how an MCTS program would consolidate
all the time. Doing so would keep the position stable while the komi
declines. As soon as he gets behind the komi degradation curve play
will automatically get more dynamic in an attempt to catch up.

The problem is: we're speculating. The proof is in the pudding.

Mark

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

2009/8/12 terry mcintyre 

> Most experiments are done on even games; this dynamic algorithm applies
> particularly to handicap games. In that context, it is not an ungainly
> kludge, but actually reflects the assessment of evenly matched pro players -
> they look at the board, and see a victory of n times 10 handicap stones ( or
> something roughly comparable ) for black.
>
> This matters because today's programs are not even close to playing at the
> pro level; to win respect, they'll have to master handicap games - and to do
> that, they'll need to do two things. First, they'll need to model the
> expectation that black with a handicap _should_ win big. Second, they'll
> need to behave gracefully as that initial advantage is whittled down.
>

I disagree.   I think strong players have a sense of what kind of mistakes
to expect, and try to provoke those mistakes.   Dynamic komi does not model
that.

It also does the opposite of making the program play provocatively, which I
believe is necessary to beat a weaker player with a large handicap against
you. Instead of making it fight, it encourages the program to be content
with less.   How does this model strong handicap players?

- Don





>
>
> Existing programs don't do either of those two things well. They're tuned
> toward even-game strategy.
>
> Terry McIntyre 
>
> “We hang the petty thieves and appoint the great ones to public office.” --
> Aesop
>
>
>

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

On Wed, Aug 12, 2009 at 5:36 PM, Mark Boon  wrote:

> I started to write something on this subject a while ago but it got
> caught up in other things I had to do.
>
> When humans play a (high) handicap game, they don't estimate a high
> winning percentage for the weaker player. They'll consider it to be
> more or less 50-50. So to adjust the komi at the beginning of the game
> such that the winning percentage becomes 50% seems a very reasonable
> idea to me. This is what humans do too, they'll assume the stronger
> player will be able to catch up a certain number of points to overcome
> the handicap.

I disagree about this being what humans do.   They do not set a fake komi
and then try to win only by that much.

I think their model is somewhat incremental, trying to win a bit at a time
but I'm quite convinced that they won't just let the opponent consolidate
like MCTS does.   With dynamic komi the program will STILL just try to
consolidate and not care about what his opponent does.   But strong players
will know that letting your opponent consolidate is not going to work. So
they will keep things complicated and challenge their weaker opponents
everywhere that is important.

- Don

>
>
> What seems difficult to me however is to devise a reasonable way to
> decrease this komi as the game progresses. In an actual game the
> stronger player catches up in leaps and bounds, not smoothly.
>
> In MC things are not always intuitive though.
>
> Mark

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread terry mcintyre

Most experiments are done on even games; this dynamic algorithm applies 
particularly to handicap games. In that context, it is not an ungainly kludge, 
but actually reflects the assessment of evenly matched pro players - they look 
at the board, and see a victory of n times 10 handicap stones ( or something 
roughly comparable ) for black. 

 
This matters because today's programs are not even close to playing at the pro 
level; to win respect, they'll have to master handicap games - and to do that, 
they'll need to do two things. First, they'll need to model the expectation 
that black with a handicap _should_ win big. Second, they'll need to behave 
gracefully as that initial advantage is whittled down. 

Existing programs don't do either of those two things well. They're tuned 
toward even-game strategy.

Terry McIntyre 


“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop



Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Mark Boon

I started to write something on this subject a while ago but it got
caught up in other things I had to do.

When humans play a (high) handicap game, they don't estimate a high
winning percentage for the weaker player. They'll consider it to be
more or less 50-50. So to adjust the komi at the beginning of the game
such that the winning percentage becomes 50% seems a very reasonable
idea to me. This is what humans do too, they'll assume the stronger
player will be able to catch up a certain number of points to overcome
the handicap.

What seems difficult to me however is to devise a reasonable way to
decrease this komi as the game progresses. In an actual game the
stronger player catches up in leaps and bounds, not smoothly.

In MC things are not always intuitive though.
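
As a rough illustration of the idea above (not taken from any existing program): given the final point margins from a batch of playouts, one can search for the komi at which the handicap taker's win rate comes out near 50%. The names `win_rate`, `balancing_komi` and `scheduled_komi` are hypothetical, the margins are synthetic, and the linear decay is only the simplest possible assumption, since how the komi should actually come down is exactly the open question.

```python
import random

def win_rate(margins, komi):
    """Share of playouts the handicap taker still wins after paying `komi`."""
    return sum(1 for m in margins if m - komi > 0) / len(margins)

def balancing_komi(margins, lo=-361.0, hi=361.0, tol=0.5):
    """Bisect for the komi at which the win rate crosses 50%.

    The win rate never increases as komi grows, so bisection is valid.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if win_rate(margins, mid) > 0.5:
            lo = mid      # still winning too often: demand more points
        else:
            hi = mid
    return (lo + hi) / 2

def scheduled_komi(initial_komi, move_number, total_moves):
    """Assumed schedule: shrink the komi linearly to zero over the game."""
    remaining = max(0.0, 1.0 - move_number / total_moves)
    return initial_komi * remaining

# Synthetic playout margins: the handicap taker is ahead by roughly 70 points.
rng = random.Random(1)
margins = [rng.gauss(70, 20) for _ in range(2000)]
komi = balancing_komi(margins)
print(round(komi), win_rate(margins, 0.0), win_rate(margins, komi))
```

With the real komi the win rate is pinned near 100%; at the balancing komi it sits near 50%, which is where a win-rate based search can tell moves apart.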

Mark

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread terry mcintyre

Some algorithms are special-purpose by nature. What I sketched is an 
approximation of my understanding of how strong players defeat weaker players 
with large handicaps. When Myungwan Kim faced off against MFG a few days ago, 
with a 7 stone handicap, he had to come up with a strategy which would 
ultimately win a theoretically unwinnable game. 

When two pros face off, and one has a 7 stone handicap, the expectation is that 
the one with the handicap will not merely win, but win by 70 points. A pro with 
such a large handicap would be satisfied with nothing less. The Nihon Ki'in has 
published a book of pro-pro handicap games, where black exploits the full power 
of the handicap stones to give white a very thorough drubbing.

 
David Fotland might have some insights into how MFG viewed that game. Perhaps 
MFG thought it was so far ahead that it was indifferent about the various 
opening moves. Who cares if the win rate is 99.9998 or 99.7? But there are 
differences, which weaker players use to hang on to as much of their advantage 
as possible, and stronger players use to wear down that advantage. It becomes a 
war of attrition - whoever runs out of troops or ammo first loses the war. 

Terry McIntyre 


“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop




From: Don Dailey 
To: computer-go 
Sent: Wednesday, August 12, 2009 2:11:58 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

The problem with MCTS programs is that they like to consolidate.   You set the 
komi and thereby give them a goal and they very quickly make moves which commit 
to that specific goal.   Committing to less than you need to actually win will 
often involve sacrificing chances to win. Sometimes it won't, but you cannot 
have a scalable algorithm which is this arbitrary.

However, if the handicap is too high, the program thinks every line is a loss 
and it plays randomly.   That's why we even consider doing this.

Dynamically changing komi could be of some benefit in that situation if there 
is no alternative reasonable strategy, but it does not address the real 
problem - which is what I call the "committal consolidation" problem.  You 
are giving the program an arbitrary short term goal which may or may not be 
compatible with the long term goal of winning the game. Whether it's 
compatible or not is based on your own credulity - not anything predictable or 
that you can scale.   And as the base program gets stronger this aspect of the 
program becomes more and more of a wart.

If this can be made to work in the short term,  it should be considered a 
temporary hack which should be fixed as soon as possible.   

We have to think about this anyway sooner or later because if programs continue 
to develop and the predictive ability of the playouts and tree search gets 
several hundred ELO better,  these programs may start to see more and more 
positions as either dead won or dead lost.  I'm sure we will want some kind 
of robust mechanism for dealing with this which is better at estimating chances 
that the opponent will go wrong  as opposed to doing something that is a random 
benefit or hindrance. 

- Don


 





2009/8/12 terry mcintyre 

Ingo suggested something interesting - instead of changing the komi according 
to the move number, or some other fixed schedule, it varies according to the 
estimated winrate. 
>
>It also, implicitly, depends on one's guess of the ability of the opponent. 
>
>An interesting test would be to take an opponent known to be weaker, offer it 
>a handicap, and tweak the dynamic komi per Ingo's suggestion. At what handicap 
>does the ratio balance at 50:50? Can the number of handicap stones be 
>increased with such an adaptive algorithm?
>
>Even better, play against a stronger opponent; can one increase the win rate 
>versus strong opponents?
>
>The usual range of computer opponents is fairly narrow. None approach high-dan 
>levels on 19x19 boards - yet.
>
> Terry McIntyre
> 
>
>
>“We hang the petty thieves and appoint the great ones to public office.” -- 
>Aesop
>
>
>

From: Brian Sheppard 
>To: computer-go@computer-go.org
>Sent: Wednesday, August 12, 2009 12:33:13 PM
>Subject: [computer-go] Dynamic komi at high handicaps
>
>
>>>The small samples is probably the least of the problems with this.   Do you
>>actually believe that you can play games against it and not be subjective
>in
>>your observations or how you play against it?
>
>These are computer-vs-computer games. Ingo is manually transferring moves
>between two computer opponents.
>
>The result does support Ingo's belief that dynamic Komi will help programs
>play high handicap games. Due to small sample size it isn't very strong
>>evidence. But maybe it is enough to induce a programmer who actually plays
>in such games to create a more exhaustive test.
>

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Ivan Dubois

I 100% agree with Don; dynamic komi just can't be the right approach, in my 
opinion.

One idea I just had is this: 
In the tree search part, instead of using a rule which converges to MAX, use a 
rule which converges to alpha*MAX + beta*AVERAGE. Do this only for plies where 
it is the weaker player's turn (the player who benefits from handicap stones).
When beta is high, it may simulate the fact that the weak player can't actually 
read out sequences reliably, thus increasing the chances of success of the 
stronger player. 
Just a wild guess anyway...
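
A minimal sketch of such a blended backup rule follows, written in minimax style for clarity. The post does not say how AVERAGE is computed or how alpha and beta are chosen, so the unweighted mean over children and the default weights here are assumptions, and the function names are hypothetical.

```python
def blended_backup(child_values, alpha, beta):
    """Value of a node where the weaker player is to move.

    child_values are win-rate estimates from the point of view of the
    player to move at this node.
    """
    if not child_values:
        raise ValueError("node has no children")
    best = max(child_values)
    average = sum(child_values) / len(child_values)  # assumed: plain mean
    return alpha * best + beta * average

def backup(child_values, weaker_to_move, alpha=0.7, beta=0.3):
    """Blend only at the weaker player's plies; use plain MAX elsewhere."""
    if weaker_to_move:
        return blended_backup(child_values, alpha, beta)
    return max(child_values)

values = [0.9, 0.5, 0.1]
print(backup(values, weaker_to_move=False))
print(backup(values, weaker_to_move=True, alpha=0.5, beta=0.5))
```

A higher beta makes the weaker side's nodes look worse for it than perfect play would, which is the intended effect: the search credits the stronger side with the chance that the opponent misses the best reply.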


- Original Message - 
  From: Don Dailey 
  To: computer-go 
  Sent: Wednesday, August 12, 2009 11:11 PM
  Subject: Re: [computer-go] Dynamic komi at high handicaps


  The problem with MCTS programs is that they like to consolidate.   You set 
the komi and thereby give them a goal and they very quickly make moves which 
commit to that specific goal.   Committing to less than you need to actually win 
will often involve sacrificing chances to win. Sometimes it won't, but you 
cannot have a scalable algorithm which is this arbitrary.

  However, if the handicap is too high, the program thinks every line is a loss 
and it plays randomly.   That's why we even consider doing this.

  Dynamically changing komi could be of some benefit in that situation if there 
is no alternative reasonable strategy, but it does not address the real 
problem - which is what I call the "committal consolidation" problem.  You 
are giving the program an arbitrary short term goal which may or may not be 
compatible with the long term goal of winning the game. Whether it's 
compatible or not is based on your own credulity - not anything predictable or 
that you can scale.   And as the base program gets stronger this aspect of the 
program becomes more and more of a wart.

  If this can be made to work in the short term,  it should be considered a 
temporary hack which should be fixed as soon as possible.   

  We have to think about this anyway sooner or later because if programs 
continue to develop and the predictive ability of the playouts and tree search 
gets several hundred ELO better,  these programs may start to see more and more 
positions as either dead won or dead lost.  I'm sure we will want some kind 
of robust mechanism for dealing with this which is better at estimating chances 
that the opponent will go wrong  as opposed to doing something that is a random 
benefit or hindrance. 

  - Don


   





  2009/8/12 terry mcintyre 

Ingo suggested something interesting - instead of changing the komi 
according to the move number, or some other fixed schedule, it varies according 
to the estimated winrate. 

It also, implicitly, depends on one's guess of the ability of the opponent. 

An interesting test would be to take an opponent known to be weaker, offer 
it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what 
handicap does the ratio balance at 50:50? Can the number of handicap stones be 
increased with such an adaptive algorithm?

Even better, play against a stronger opponent; can one increase the win 
rate versus strong opponents?

The usual range of computer opponents is fairly narrow. None approach 
high-dan levels on 19x19 boards - yet.


Terry McIntyre 


“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop




From: Brian Sheppard 
To: computer-go@computer-go.org
Sent: Wednesday, August 12, 2009 12:33:13 PM
Subject: [computer-go] Dynamic komi at high handicaps


>The small samples is probably the least of the problems with this.  Do you
>actually believe that you can play games against it and not be subjective
in
>your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves
between two computer opponents.

The result does support Ingo's belief that dynamic Komi will help programs
play high handicap games. Due to small sample size it isn't very strong
evidence. But maybe it is enough to induce a programmer who actually plays
in such games to create a more exhaustive test.


Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

Terry,

I understand the reasoning behind this, your thought experiment did not add
anything to my understanding. And I agree that if the program is strong
enough and the handicap is high enough this is probably better than doing
nothing at all.

However, I think there must be something that is more along the lines of
treating the disease, not the symptoms. You might be able to put a band
aid on the problem but you have not addressed the real issue in a systematic
way.

Besides, I have not yet seen anyone demonstrate that this works - it's
always talked about but never implemented. It is made to sound so simple
that you have to wonder where the implementation is and why the strong
programs do not have it.

- Don




2009/8/12 terry mcintyre 

> Consider this thought experiment.
>
> You sit down at a board and your opponent has a 9-stone handicap.
>
> By any objective measure of the game, you should resign immediately.
>
> All your win-rate calculations report this hopeless state of affairs.
>
> Winrate gives you no objective basis to prefer one move or another.
>
> But, you think, what if I can make a small group? What if I try for a
> lesser goal, such as "don't lose by more than 90 points?"
>
> Your opponent has a 9 stone handicap because he makes more mistakes than
> you do.
>
> As the game progresses, those mistakes add up. You set your goal higher -
> losing by only 50 points; losing by only 10 points.
>
> The changing goal permits you to discriminate in a field which would
> otherwise look like a dark, desolate, win-less landscape.
>
> Terry McIntyre 
>
> “We hang the petty thieves and appoint the great ones to public office.” --
> Aesop
>
> --
> *From:* Don Dailey 
> *To:* computer-go 
> *Sent:* Wednesday, August 12, 2009 1:05:36 PM
> *Subject:* Re: [computer-go] Dynamic komi at high handicaps
>
> Ok,  I misunderstood his testing procedure.  What he is doing is far more
> scientific than what I thought he was doing.
>
> There has got to be something better than this.   What we need is a way to
> make the playouts more meaningful but not by artificially reducing our
> actual objective which is to win.
>
> For the high handicap games,  shouldn't the goal be to maximize the
> score?   Instead of adjusting komi why not just change the goal to win as
> much of the board as possible? This would be far more honest and reliable
> I would think and the program would not be forced to constantly waste effort
> on constantly changing goals.
>
>
> - Don
>
>
>
>
>
> On Wed, Aug 12, 2009 at 3:33 PM, Brian Sheppard wrote:
>
>> >The small samples is probably the least of the problems with this.   Do
>> you
>> >actually believe that you can play games against it and not be subjective
>> in
>> >your observations or how you play against it?
>>
>> These are computer-vs-computer games. Ingo is manually transferring moves
>> between two computer opponents.
>>
>> The result does support Ingo's belief that dynamic Komi will help programs
>> play high handicap games. Due to small sample size it isn't very strong
>> evidence. But maybe it is enough to induce a programmer who actually plays
>> in such games to create a more exhaustive test.
>>

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

The problem with MCTS programs is that they like to consolidate.   You set
the komi and thereby give them a goal, and they very quickly make moves which
commit to that specific goal.   Committing to less than you need to actually
win will often involve sacrificing chances to win. Sometimes it won't,
but you cannot have a scalable algorithm which is this arbitrary.

However, if the handicap is too high, the program thinks every line is a
loss and it plays randomly.   That's why we even consider doing this.
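
The flat-evaluation effect described above can be seen in a toy calculation. This is only a sketch with synthetic numbers: the two candidate moves, their margin distributions and the komi value are assumptions chosen for illustration, not data from any program.

```python
import random

def win_rate(margins, komi):
    """Fraction of playouts whose final margin beats the komi threshold."""
    return sum(1 for m in margins if m > komi) / len(margins)

rng = random.Random(0)
# Synthetic playout margins for the side giving a large handicap.
# Move B is clearly better on points, but both moves still lose the game.
move_a = [rng.gauss(-90, 12) for _ in range(2000)]
move_b = [rng.gauss(-70, 12) for _ in range(2000)]

# Real objective (komi 0): both win rates are essentially zero, so the
# search has nothing to distinguish the two moves.
print(win_rate(move_a, 0.0), win_rate(move_b, 0.0))

# Lowered goal ("don't lose by more than 80"): the moves separate clearly.
print(win_rate(move_a, -80.0), win_rate(move_b, -80.0))
```

This only shows why a shifted komi restores a usable signal; it says nothing about whether the moves it then prefers are compatible with actually winning, which is the objection raised here.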

Dynamically changing komi could be of some benefit in that situation if
there is no alternative reasonable strategy, but it does not address the
real problem - which is what I call the "committal consolidation"
problem.  You are giving the program an arbitrary short term goal which
may or may not be compatible with the long term goal of winning the
game. Whether it's compatible or not is based on your own credulity -
not anything predictable or that you can scale.   And as the base program
gets stronger this aspect of the program becomes more and more of a wart.

If this can be made to work in the short term,  it should be considered a
temporary hack which should be fixed as soon as possible.

We have to think about this anyway sooner or later because if programs
continue to develop and the predictive ability of the playouts and tree
search gets several hundred Elo better, these programs may start to see
more and more positions as either dead won or dead lost.  I'm sure we
will want some kind of robust mechanism for dealing with this which is
better at estimating chances that the opponent will go wrong, as opposed to
doing something that is a random benefit or hindrance.

- Don







2009/8/12 terry mcintyre 

> Ingo suggested something interesting - instead of changing the komi
> according to the move number, or some other fixed schedule, it varies
> according to the estimated winrate.
>
> It also, implicitly, depends on one's guess of the ability of the opponent.
>
>
> An interesting test would be to take an opponent known to be weaker, offer
> it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what
> handicap does the ratio balance at 50:50? Can the number of handicap stones
> be increased with such an adaptive algorithm?
>
> Even better, play against a stronger opponent; can one increase the win
> rate versus strong opponents?
>
> The usual range of computer opponents is fairly narrow. None approach
> high-dan levels on 19x19 boards - yet.
>
> Terry McIntyre 
>
> “We hang the petty thieves and appoint the great ones to public office.” --
> Aesop
> --
> *From:* Brian Sheppard 
> *To:* computer-go@computer-go.org
> *Sent:* Wednesday, August 12, 2009 12:33:13 PM
> *Subject:* [computer-go] Dynamic komi at high handicaps
>
> >The small samples is probably the least of the problems with this.  Do you
> >actually believe that you can play games against it and not be subjective
> in
> >your observations or how you play against it?
>
> These are computer-vs-computer games. Ingo is manually transferring moves
> between two computer opponents.
>
> The result does support Ingo's belief that dynamic Komi will help programs
> play high handicap games. Due to small sample size it isn't very strong
> evidence. But maybe it is enough to induce a programmer who actually plays
> in such games to create a more exhaustive test.
>

Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread dhillismail


I think Terry's suggestion is the best way to test these ideas:



1) Take 2 severely mismatched engines (perhaps 2 versions of the same engine 
but with different numbers of playouts.)

2) Find the fair handicap by playing a sequence of games and adjusting the 
number of handicap stones whenever one side loses N out of M games.

3) Plot the handicap over time; it should converge, more or less.

4) Keeping one engine fixed, adjust the other engine, using dynamic komi or 
whatever you think is the best way, and see how much you can improve on the 
handicap.
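
Steps 2 and 3 above could be sketched roughly as follows. This is only an illustration: `play_game` is a hypothetical callback that plays one game at the given handicap and returns True when the stronger engine wins, and the values N=3, M=4 and the block count are arbitrary placeholders.

```python
def find_fair_handicap(play_game, start=0, n=3, m=4, blocks=50, max_handicap=30):
    """Staircase search for the fair handicap.

    The weaker engine receives `handicap` stones. Games are played in
    blocks of at most m; as soon as one side has lost n games of the
    block, the handicap moves one stone against the winner.
    Returns the handicap after every block, ready for plotting.
    """
    handicap = start
    history = [handicap]
    for _ in range(blocks):
        strong_wins = weak_wins = 0
        for _ in range(m):
            if play_game(handicap):      # assumed: True = stronger engine won
                strong_wins += 1
            else:
                weak_wins += 1
            if strong_wins >= n or weak_wins >= n:
                break
        if strong_wins >= n:             # weaker side lost n: give it more stones
            handicap = min(max_handicap, handicap + 1)
        elif weak_wins >= n:             # stronger side lost n: take a stone away
            handicap = max(0, handicap - 1)
        history.append(handicap)
    return history


if __name__ == "__main__":
    # Toy stand-in for real engines: the stronger side wins below 7 stones.
    print(find_fair_handicap(lambda h: h < 7, blocks=12))
```

For step 4, the same loop would be run a second time with the modified engine, and the two plateaus compared.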



- Dave Hillis



Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread terry mcintyre

Consider this thought experiment.

You sit down at a board and your opponent has a 9-stone handicap.

By any objective measure of the game, you should resign immediately.

All your win-rate calculations report this hopeless state of affairs. 

Winrate gives you no objective basis to prefer one move or another.

But, you think, what if I can make a small group? What if I try for a lesser 
goal, such as "don't lose by more than 90 points?"

Your opponent has a 9 stone handicap because he makes more mistakes than you 
do. 

As the game progresses, those mistakes add up. You set your goal higher - 
losing by only 50 points; losing by only 10 points. 

The changing goal permits you to discriminate in a field which would otherwise 
look like a dark, desolate, win-less landscape.
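
The effect can be shown with a small numerical sketch. The playout margins below are invented for illustration, and `win_rate` is a hypothetical helper, not code from any existing program: at the real komi both candidate moves look equally hopeless, while under the lesser goal they separate.

```python
import random


def win_rate(margins, goal_komi):
    """Fraction of playouts that count as wins when goal_komi points are
    added to our final margin (margin > 0 means we are ahead on the board)."""
    return sum(m + goal_komi > 0 for m in margins) / len(margins)


rng = random.Random(0)
# Assumed playout results for two candidate moves: both lose heavily,
# but move B loses by about 15 points less on average.
move_a = [rng.gauss(-85, 15) for _ in range(2000)]
move_b = [rng.gauss(-70, 15) for _ in range(2000)]

if __name__ == "__main__":
    for komi in (0.5, 90.5):   # real goal vs. "don't lose by more than 90"
        print(komi, round(win_rate(move_a, komi), 3), round(win_rate(move_b, komi), 3))
```

With the real goal both win rates sit at essentially zero, so the search has nothing to choose between; with the 90-point goal move B is clearly preferred.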

Terry McIntyre 

“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop



Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

Ok,  I misunderstood his testing procedure.  What he is doing is far more
scientific than what I thought he was doing.

There has got to be something better than this.   What we need is a way to
make the playouts more meaningful but not by artificially reducing our
actual objective which is to win.

For the high handicap games, shouldn't the goal be to maximize the score?
Instead of adjusting komi, why not just change the goal to win as much of the
board as possible? This would be far more honest and reliable, I would
think, and the program would not be forced to waste effort on
constantly changing goals.

- Don


Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread terry mcintyre

Ingo suggested something interesting - instead of changing the komi according 
to the move number, or some other fixed schedule, it varies according to the 
estimated winrate. 

It also, implicitly, depends on one's guess of the ability of the opponent. 

An interesting test would be to take an opponent known to be weaker, offer it a 
handicap, and tweak the dynamic komi per Ingo's suggestion. At what handicap 
does the ratio balance at 50:50? Can the number of handicap stones be increased 
with such an adaptive algorithm?

Even better, play against a stronger opponent; can one increase the win rate 
versus strong opponents?

The usual range of computer opponents is fairly narrow. None approach high-dan 
levels on 19x19 boards - yet.

 Terry McIntyre 


“We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop








[computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Brian Sheppard

>The small sample size is probably the least of the problems with this. Do you
>actually believe that you can play games against it and not be subjective in
>your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves
between two computer opponents.

The result does support Ingo's belief that dynamic Komi will help programs
play high handicap games. Due to small sample size it isn't very strong
evidence. But maybe it is enough to induce a programmer who actually plays
in such games to create a more exhaustive test.


Re: [computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Don Dailey

2009/8/12 "Ingo Althöfer" <3-hirn-ver...@gmx.de>

> In the last few weeks I have experimented a lot with dynamic
> komi in games with high handicap. Especially, I used the
> really nice commercial program Many Faces of Go (version 12.013)
> with its Monte Carlo level (about 2 kyu on 19x19 board) and
> its traditional 18-kyu level as the opponent.
>
> At handicap 21 I played (manually!) 8 games with these opponents:
> 4 games with static komi (0.5) - here MFoG (2-kyu) won 1 of the 4 games.
> 4 games with dynamic komi - here MFoG (2-kyu) won 3 of the 4 games.
>
> I used "dynamic komi" in the following "Rule 42" way. Starting point for
> this internal artificial komi was a very high value (to compensate for
> the handicap stones), typically 300.5 or 320.5 .
> Then, always when the evaluation had climbed up to 42 % or higher,
> dynamic komi was reduced by 50 or 30 or 20 (or 10 near the end),
> until finally the true value of 0.5 was reached.
>
> After this little sample I also tried a few games with dynamic komi
> at handicap 25. After some unsuccessful games (the Monte Carlo side
> died of starvation at komi=40.5 or 30.5) today one win came out:
> In best Monte Carlo fashion, the MC-level won by half a point.
>
> I have included sgf of this game.
>
> I am aware that small samples are not enough to prove something.
> Therefore, I hope that programmers may realize automatic versions
> of something like "Rule 42" to find out how their programs behave
> with dynamic komi.


The small sample size is probably the least of the problems with this. Do you
actually believe that you can play games against it and not be subjective in
your observations or how you play against it?


- Don




>
> Ingo.

[computer-go] Dynamic komi at high handicaps

2009-08-12 Thread Ingo Althöfer

In the last few weeks I have experimented a lot with dynamic
komi in games with high handicap. In particular, I used the
really nice commercial program Many Faces of Go (version 12.013)
with its Monte Carlo level (about 2 kyu on a 19x19 board) and
its traditional 18-kyu level as the opponent.

At handicap 21 I played (manually!) 8 games with these opponents:
4 games with static komi (0.5) - here MFoG (2-kyu) won 1 of the 4 games.
4 games with dynamic komi - here MFoG (2-kyu) won 3 of the 4 games.

I used "dynamic komi" in the following "Rule 42" way. The starting point for
this internal artificial komi was a very high value (to compensate for
the handicap stones), typically 300.5 or 320.5.
Then, whenever the evaluation had climbed to 42 % or higher,
dynamic komi was reduced by 50 or 30 or 20 (or 10 near the end),
until finally the true value of 0.5 was reached.
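
An automatic version of this rule might look like the sketch below. The 42 % threshold, the step sizes 50/30/20/10 and the true komi of 0.5 come from the description above; the bands that decide which step size applies are an assumption made for illustration, since the manual procedure does not fix them. The function name `rule42_update` is hypothetical.

```python
def rule42_update(komi, winrate, true_komi=0.5, threshold=0.42,
                  bands=((200.0, 50.0), (100.0, 30.0), (40.0, 20.0), (0.0, 10.0))):
    """Return the artificial komi to use for the next move.

    komi     -- current artificial komi in favour of the Monte Carlo side
    winrate  -- the engine's own evaluation (0..1) under that komi
    bands    -- (minimum excess over true_komi, reduction step); ASSUMED values
    """
    if winrate < threshold or komi <= true_komi:
        return komi                      # not confident enough, or already done
    excess = komi - true_komi
    for min_excess, step in bands:
        if excess > min_excess:
            return max(true_komi, komi - step)
    return komi


if __name__ == "__main__":
    komi, schedule = 300.5, [300.5]
    while komi > 0.5:                    # pretend the evaluation is always >= 42 %
        komi = rule42_update(komi, 0.45)
        schedule.append(komi)
    print(schedule)
```

In a real engine the update would be called once per move with the evaluation from the search that has just finished, so the komi only drops as fast as the opponent's mistakes allow.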

After this little sample I also tried a few games with dynamic komi
at handicap 25. After some unsuccessful games (the Monte Carlo side
died of starvation at komi=40.5 or 30.5) today one win came out:
In best Monte Carlo fashion, the MC-level won by half a point.

I have included sgf of this game.

I am aware that small samples are not enough to prove anything.
Therefore, I hope that programmers will implement automated versions
of something like "Rule 42" to find out how their programs behave
with dynamic komi.

Ingo.


handicap25-dynamicKomi.sgf
Description: application/go-sgf

Re: [computer-go] new kid on the block

2009-08-12 Thread Don Dailey

>
>
> Yes, known problem :-( I'm still trying to find a method to see if a
> point is in an eye. Should not be too difficult in theory but in
> practice i have not found a method yet.


Are you talking about 1 point eyes? For this I think most programs use
the same definition, which is quite good and safe.   As far as I know there
is no perfect rule but this is close to perfect.

The definition of an eye we use is this:

An empty point whose direct neighbors are all of the same color AND whose
diagonal neighbors contain no more than 1 stone of the opposite color.

This definition must be modified slightly if the point in question is on the
edge of the board - in which case there must be NO diagonal enemy stones.
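
As a minimal sketch, the definition could be coded like this. The board representation (a list of lists with the constants below), the `parse_board` helper and the function name are assumptions for illustration; off-board direct neighbors are simply ignored.

```python
EMPTY, BLACK, WHITE = 0, 1, 2


def parse_board(rows):
    """Helper for the example: '.' empty, 'X' black, 'O' white."""
    return [[{".": EMPTY, "X": BLACK, "O": WHITE}[ch] for ch in row] for row in rows]


def is_one_point_eye(board, row, col, color):
    """True if (row, col) is a one-point eye for `color` under the rule above."""
    size = len(board)
    if board[row][col] != EMPTY:
        return False
    # All direct neighbors that exist on the board must be our own stones.
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r, c = row + dr, col + dc
        if 0 <= r < size and 0 <= c < size and board[r][c] != color:
            return False
    opponent = BLACK if color == WHITE else WHITE
    enemy_diagonals = 0
    on_edge = False
    for dr, dc in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
        r, c = row + dr, col + dc
        if 0 <= r < size and 0 <= c < size:
            if board[r][c] == opponent:
                enemy_diagonals += 1
        else:
            on_edge = True               # a missing diagonal means edge or corner
    # Centre: at most one enemy diagonal. Edge or corner: none at all.
    return enemy_diagonals <= (0 if on_edge else 1)
```

During playouts, a move onto a point for which this returns True for the side to move would simply be skipped.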

To know if a point is inside a bigger eye - that's much more speculative I
think.

- Don








Re: [computer-go] new kid on the block

2009-08-12 Thread Folkert van Heusden

> Congrats for breaking the 1000 elo mark on cgos. ;)

Thanks!
Version 0.5 made quite a difference compared to version 0.4.
I'm graphing the elo ratings of the versions running at cgos here:
http://keetweej.vanheusden.com/stats/stop-all-elo-cgos.png

> Some things I noticed when watching 2 games:

> -stop plays on the first line/corner in the beginning. maybe this helps: 
> http://computer-go.org/pipermail/computer-go/2008-December/017340.html
> or this: 
> http://computer-go.org/pipermail/computer-go/2008-December/017457.html

Ok, will read that.

> -stop fills its own eyes, killing alive groups. you should prevent moves 
> that fill own eyes. look here: 
> http://computer-go.org/pipermail/computer-go/2008-May/014929.html

Yes, known problem :-( I'm still trying to find a method to see if a
point is in an eye. It should not be too difficult in theory, but in
practice I have not found a method yet.
After a side puts a stone on an intersection, I collect all stones and put
them in separate arrays, one array per chain. What I should do next
is find out whether such a chain forms a closed loop and then use a simple
point-in-polygon algorithm to check whether the point is in an eye. I am
still failing on that.
Any tips are welcome!
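
For reference, the chain-collection step described above can be done with a plain flood fill. This is only a sketch under assumed conventions (board as a list of strings, '.' for empty, any other character a stone color); it is not the actual code of the program.

```python
def collect_chains(board):
    """Group stones into chains: orthogonally connected stones of one color.

    Returns a list of (color, sorted list of (row, col)) in scan order.
    """
    size = len(board)
    seen = set()
    chains = []
    for r in range(size):
        for c in range(size):
            if board[r][c] == "." or (r, c) in seen:
                continue
            color = board[r][c]
            chain, stack = [], [(r, c)]
            seen.add((r, c))
            while stack:                 # iterative flood fill from this stone
                cr, cc = stack.pop()
                chain.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < size and 0 <= nc < size
                            and (nr, nc) not in seen and board[nr][nc] == color):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            chains.append((color, sorted(chain)))
    return chains
```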


Folkert van Heusden

-- 
Multi tail barnamaj mowahib li mora9abat attasjilat wa nataij awamir
al 7asoub. damj, talwin, mora9abat attarchi7 wa ila akhirih.
http://www.vanheusden.com/multitail/
--
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com

Re: [computer-go] new kid on the block

2009-08-12 Thread Isaac Deutsch

Congrats on breaking the 1000 Elo mark on CGOS. ;) Some things I
noticed when watching 2 games:

- stop plays on the first line/corner in the beginning. Maybe this helps:
  http://computer-go.org/pipermail/computer-go/2008-December/017340.html
  or this:
  http://computer-go.org/pipermail/computer-go/2008-December/017457.html
- stop fills its own eyes, killing live groups. You should prevent
  moves that fill its own eyes. Look here:
  http://computer-go.org/pipermail/computer-go/2008-May/014929.html


regards,
ibd