Raymond Wold wrote:
>> Playing strength is a function of everything you know and don't know
>> and all your strengths and weaknesses and how you put them together.
>> Every player, even the very best has weaknesses but those do not
>> determine how strong he is.
>
> These two sentences contradict eachother to me. If a program plays as a
> 9dan professional as long as there are never any ladders, but has a bug
> that flips the status of every ladder, then that weakness /does/
> determine how strong he is. As incredible as it would be to make such a
> program, it would not deserve to be classed along with the human 9dan
> professionals, it would probably be more correct to consider it double
> digit kyu,

This makes no sense. The rating of the program is its strength given
this weakness. If this turns out to be 9dan pro despite ladder
misreading, for example because it turns out to be very hard to set up
game-result-defining ladders, the program is 9dan pro.

There is a tendency for humans to rate computers according to the flaws
they can see and understand. If a 10kyu sees a program making a mistake
he understands and could avoid, he will think the program is worse than
10kyu. You are also falling into this trap by giving that example.

> It's entirely conceivable that such a fault could go undetected at
> least for a few games, or that the program could get wins against
> professionals that didn't know about it.

So what? Then the program is stronger than the professionals until they
fix their weaknesses and improve enough to hit upon those of the
program. Strength is a relative, not an absolute measure.

>> Basically your idea of fair is that the first few games shouldn't
>> count - you just said it differently and it's a ridiculous idea.
>
> That is indeed what I am saying, and I don't think it is so ridiculous.

You are saying that a known weakness of human players is that they need
warmup time and that they should not be rated according to their weakest
performance, which is coincidentally what you are arguing *against*.
Cherry-picking results is never a valid method, regardless of the side
arguing.

> A human can much easier detect and correct his flaws, especially when
> they are being exploited. Thus the effect of weaknesses isn't as
> important for humans.

A weakness of human chess players against computers is that they don't
perform well at very fast time controls, which led GMs to lose to
computers even when the latter were far behind in playing strength at
slow time controls. I'm curious what you suggest to fix this human
weakness.

Another weakness is that they have problems reliably visualizing
positions 20 ply out, identifying all the tactical possibilities there,
and backtracking that to the current position. The nasty computers
exploit this in almost every game. I'm also curious how you suggest
adapting to that.

> If I am anti-anything, it would be against bias in program authors and
> testers. I am for intellectual honesty. If you think a program should be
> compensated in its ranking for the handicap that it can't learn, you
> should give the program a much higher ranking than it would get on a go
> server.

As already explained, this argument works perfectly well the other way
around (warmup time for humans and you wanting to drop the first games).
If you think you are unbiased or intellectually honest when making such
an argument, you're fooling yourself.

> I want programs to improve under /my/ standard. I am kind of hoping
> other go coders feel the same. Isn't handling the hard problem of
> playing go what this is all about, and not just getting a high rating
> for kudos and commercial gain? Surely tackling what your program is
> worst at does this the best?

Handling the hard problem of go means maximizing playing strength. That
*is* my interest, and this does *NOT* entail fixing every possible
weakness. Practice has demonstrated this convincingly for Go, for chess,
and for other games. The strength of a program is *NOT* solely
determined by its weakest part.

Your "commercial gain" argument is very lame and silly. If this were the
interest, fixing the weaknesses would be more important than making the
program play well. To understand why, see the second paragraph of this
mail.

-- 
GCP
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
