On Fri, Jul 23, 2010 at 3:54 PM, Raymond Wold <[email protected]> wrote:
> On 23.07.2010 02:12, Stefan Kaitschick wrote:
>
>> Why should the worst case be the most interesting?
>> In a program of this complexity worst case isn't the "true" strength of
>> the program.
>> Worst case is basically a bug.
>>
>
> Given enough time, even an "AI" that chooses entirely randomly from the
> legal moves available will get a win against the best human.
I'm trying to follow this conversation, but I don't get your point on this.
>
> What's wrong with looking at average play to judge the program?
>>
>
> Well, it depends on how you measure the average. The typical, putting your
> bot on a go server and letting it play self-selected humans, is not very
> good, as surprise at playing style, no knowledge of the program's
> fundamental flaws, and so on, (not to mention humans not necessarily taking
> a game against a bot as serious) will bias it towards the program.
>
Playing strength is a function of everything you know and don't know, all
your strengths and weaknesses, and how you put them together. Every
player, even the very best, has weaknesses, but those do not determine how
strong he is.
I think your mental model of how this works is broken.
I used to believe that a chain is only as strong as its weakest link, and
I used this analogy to improve my tennis game. But a pro I was friends with
taught me a pretty valuable lesson. He told me that my weaknesses do not
have to define my game, and he taught me how to be aggressive with my
strengths and thus minimize the weaknesses. He gave me several examples
of top pros who were far from the best at certain things but played in
such a way that it was only a minor issue, and they reached the very top
ranks. Then I remembered that a chess master had told me basically the same
thing.
> I would not object to an average of, say, 100 games against one human
> opponent trying his best to win. With an even result under such a series I
> would certainly consider the program as strong as the human.
Within 2 or 3 games the human has learned 80% of what he needs to know.
Since most players who will play your program have already played many
others that have the same basic characteristics, my conclusion is that
computer strength as measured by Elo in KGS games is accurate, because the
programs have already taken the hit for this particular weakness.
Your idea that there is some kind of break-even point around game 100 is
completely ludicrous.
Basically your idea of fair is that the first few games shouldn't count.
You just said it differently, and it's a ridiculous idea.
I cannot tell you how often I have played some opponent that I could not
beat the first few times until I learned how to play him (in chess), and
just the opposite has also happened to me, where I seemed to win easily but
it was clear that my opponent was studying and learning. Your comments
suggest the first few games are invalid in ANY encounter.
>
> And in terms of "interesting" I must say that I find the program's best
>> play much more interesting than its worst play.
>> With best play I don't mean some book play of course, but a fine solution
>> to a tricky problem.
>>
>
> "Tricky problems" is what a computer does best, a localized search for a
> solution, possibly even brute forced. This isn't very impressive to me.
You are clearly anti-computer, and your comments seem to reflect a kind of
emotional prejudice instead of logic: for instance, thinking that it's fair
to set up matches in such a way that the computer's weaknesses are
exaggerated even more. The computer's inability to learn is already a
handicap right from the first move.
>
> Granted, truly awesome play is currently mostly to be seen on 9*9.
>> But I've seen some great kills on the big board that any top amateur could
>> be proud of.
>>
>
> And how do you deal with confirmation bias? If you look for exceptional
> results, do you also look for spectacular failures? What about if a program
> gets an occasional brilliant win, but still loses most of the games?
There is a system that averages the two; it's called the Elo system, or if
you prefer, the Go ranking system. The spectacular failures will be
reflected in the numbers.
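
To make the "averaging" concrete, here is a minimal sketch of the textbook
Elo update. The function names, the K-factor of 32, and the 1500 starting
rating are assumptions picked purely for illustration; KGS computes its
ranks with its own method, which this does not attempt to reproduce.

```python
def expected_score(rating, opponent):
    """Expected score (between 0 and 1) under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(rating, opponent, score, k=32.0):
    """Return the new rating after one game; score is 1 for a win, 0 for a loss."""
    return rating + k * (score - expected_score(rating, opponent))


def rate_series(start, opponent, results, k=32.0):
    """Apply update() game by game against a fixed-rating opponent."""
    rating = start
    for score in results:
        rating = update(rating, opponent, score, k)
    return rating


# One brilliant win followed by nine losses against an equally rated player:
# the win raises the rating once, the losses pull it back down below the start.
final = rate_series(1500.0, 1500.0, [1] + [0] * 9)
print(final < 1500.0)
```

Every game moves the number, so an occasional brilliant win cannot hide a
losing record, and a spectacular failure costs no more than any other loss.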
You remind me of the man who ties someone's hands behind his back and then
fights him. When the handicapped man bites you, you complain that he is
not fighting fair.
If you set up the match right, you can give either player an advantage but I
for one would seriously not trust the results of such manipulation. I
think you need some kind of reasonable justification for setting up
match conditions such that one player has every possible advantage.
So I have a suggestion. Let's play the match at the rate of game in 1
minute. I hold the human to higher standards, and it doesn't seem fair to
me to set time controls in such a way that the human has such an easy game
of it. That's just not fair, and it inflates the Elo of the human player.
Your counter-argument had better not be that this is the accepted time
control for human players. Of course it is.
Don
>
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>