Oh yes, the graphs are still there: http://cgos.boardspace.net/study/
http://cgos.boardspace.net/study/13/

- Don

On Thu, 2008-08-28 at 10:10 -0400, Don Dailey wrote:
> On Thu, 2008-08-28 at 09:38 +0200, Rémi Coulom wrote:
> > Don Dailey wrote:
> > > I don't really believe the ELO model is "very wrong." I only believe
> > > it is a mathematical model that is "somewhat" flawed for chess and
> > > presumably also for other games. Do you have an alternative that might
> > > be more accurate?
> > >
> > > - Don
> >
> > I don't have very precise data about computer vs human. But in computer
> > vs computer, I believe some of the data of your scalability study
> > clearly demonstrated that strength of MoGo against Leela scaled
> > differently from the strength of MoGo against GNU. But my memory of that
> > data may not be very good.
>
> Yes, I think you are remembering the FatMan MoGo study where FatMan
> didn't scale nearly as well with time.
>
> The idea that I'm resistant to is NOT that some programs scale better
> than others; I take that as a given.
>
> In the graph, it was very clear that MoGo scaled better at higher
> levels; I don't care about that. What I am asserting is that the 2600
> version of FatMan in the study is going to play about 2600 ELO strength
> against every opponent, not just MoGo. And the stronger MoGo version
> will on average tend to be stronger against all other opponents. If you
> could somehow magically place a bunch of humans of widely varying
> strengths in this study, as a kind of control, it would have a
> relatively minor effect on the 2 respective lines.
>
> I guess my assertion is that Occam's Razor should be applied here until
> we have really good reason to believe differently. If the study shows
> 2600 ELO we do not need to explain it away by some more complicated
> theory. If my ego were hurt by the fact that MoGo scales better, I
> could easily construct a theory that explained it away. This is what we
> tend to do when we don't want to believe something.
> That's what I think is being done with the argument that improvement
> against computers doesn't translate to improvement against humans.
> Sort of a false modesty - when I beat a much stronger player once I
> made excuses for him; my brain was not ready to fully accept the win.
>
> I am probably older than many here (I'm 52) and I have some historical
> perspective to look back on with computer chess. Unfortunately I don't
> have anything in print to back this up going back 30 years, but I know
> the general feeling many had was that computer chess scalability and
> skill don't carry over to play against humans. In one of the earliest
> scalability studies it was seen that improvement is pretty remarkable at
> increasing depths of search, and I do remember this being explained
> away - a remark to the effect that if this is to be believed, then in a
> few years we would have a computer grandmaster, and that being laughed
> off, so to speak. Guess what happened?
>
> I would like to put this to rest (and be proved correct) but
> unfortunately I don't have any proof of my assertions either. It comes
> down to an opinion. I'm willing to be proved wrong but we all know how
> difficult it is to get reliable evidence from human-played games. It
> really would require thousands if not millions of fairly played
> human-computer games at a huge variety of controlled levels and
> conditions to make sense of this. The scalability study I did involved
> levels that cannot be practically played against humans in quantity.
> Making a fair study against humans is really messy, and the fact that
> humans adapt to specific opponents better than computers do will tend
> to suppress the ratings of the machines artificially. Humans also try
> harder against stronger opponents, which also skews the results.
>
> Something interesting happened in computer chess about 20 years ago,
> give or take.
> There was a sudden explosion of interest in computer
> chess when stand-alone chess computers started getting strong enough and
> popular enough that a lot of chess players would purchase them. They
> started getting official ratings and playing in tournaments. But
> people very quickly learned how to play against them (someone even wrote
> a book about how to beat computers) and it was as if the progress of the
> computers stalled. We "knew" that computers were continuing to improve
> but it wasn't "translating" against humans. Again, a few were
> postulating that they had reached a plateau and that from now on you
> would only see very minor progress. There was something like a 200
> ELO barrier that computers had to overcome to make up for the fact that
> most chess players no longer feared the computers and knew how to play
> them. But it was a local phenomenon. It was a temporary glitch.
>
> There have always been certain players who specialize in beating
> computers, and a style developed where you could get into a closed
> position and play a slow-developing sneak attack against the computer's
> kingside. This style worked against most of the weaknesses that
> computers had. It created some serious intransitivities such that
> sometimes a low-rated player would be unusually good at beating
> higher-rated computers. As far as I know there was no general solution
> that fixed this; computers simply kept improving to the point where
> their strengths dominated the game. And of course over time as computers
> search deeper and deeper this weakness also gradually goes away. You
> almost always have to make some compromise to get the kind of game you
> want (as in high-handicap Go games, where KIM said you HAVE to
> overplay) and thus this intransitivity was a local phenomenon; it
> applied mostly to computers below 2400 ELO in strength, for
> instance.
>
> This is why I currently believe these things are temporary and local.
> You cannot continue to get better without discarding some of your
> weaknesses. And of course your strengths can be used to neutralize your
> weaknesses, like the tennis player whose forehand attacks so strongly
> that the opponent cannot get to his weak backhand. Take away
> his strengths, and his weaknesses start to become big factors.
>
> Rémi, I know that you are a bit of an expert on rating systems and I
> make heavy use of your bayeselo program - a beautiful tool. I wish you
> would produce a command line version that didn't require interactivity.
> Anyway, have you ever considered a rating system that could somehow
> measure strength in more than 1 dimension? I suppose a huge problem
> would be getting statistically reasonable samples, but it would be
> pretty cool if you could detect the rock, scissors, paper effect and,
> given enough data, be able to predict performance against particular
> styles of players. That would require a system that used more than just
> a single value to specify playing ability. It might be that if you
> assumed just 2 or 3 dimensions instead of 1, it could be a major
> improvement. Perhaps most players make use of 2 or 3 major cognitive
> abilities to play Go? I know in chess I encountered superior players
> who could not calculate as well as I could, but they were clearly better
> players than I. And there were others just the opposite, who seemed to
> know nothing about chess but were so tactically tenacious that they were
> difficult to beat.
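
To make the multi-dimensional rating question above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not bayeselo's model and not something proposed in this thread. The standard Elo expected-score formula is well known; everything else (the 2-D "style" vector, the names `styled_expected` and `style`, and the magnitude of 20) is hypothetical. The style term is antisymmetric, so it can favour A over B without making A stronger against everyone, which is the rock, scissors, paper effect.

```python
import math

def elo_expected(r_a, r_b):
    """Standard Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def styled_expected(a, b):
    """Hypothetical extension: scalar rating plus an antisymmetric style term.

    Each player is (rating, (x, y)). The 2-D cross product of the two style
    vectors flips sign when the players are swapped, so the two expected
    scores still sum to 1.
    """
    (r_a, (xa, ya)), (r_b, (xb, yb)) = a, b
    style_edge = xa * yb - ya * xb  # in Elo points; assumption for illustration
    return elo_expected(r_a + style_edge, r_b)

def style(angle_deg, magnitude=20.0):
    """Hypothetical helper: a style vector at the given angle."""
    t = math.radians(angle_deg)
    return (magnitude * math.cos(t), magnitude * math.sin(t))

# Three players with identical scalar ratings but styles 120 degrees apart.
rock = (2000.0, style(0))
scissors = (2000.0, style(120))
paper = (2000.0, style(240))

if __name__ == "__main__":
    for name, (a, b) in {
        "rock vs scissors": (rock, scissors),
        "scissors vs paper": (scissors, paper),
        "paper vs rock": (paper, rock),
    }.items():
        print(f"{name}: {styled_expected(a, b):.2f}")
```

With a single number per player, all three pairings would come out even. With the extra two coordinates, each player is a favourite in one pairing and an underdog in another. Fitting such coordinates from real game results is the hard part raised above, since it needs far more data than a one-dimensional rating.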
>
> - Don
>
> >
> > Rémi
> > _______________________________________________
> > computer-go mailing list
> > computer-go@computer-go.org
> > http://www.computer-go.org/mailman/listinfo/computer-go/