Oh yes,  the graphs are still there:

 http://cgos.boardspace.net/study/

 http://cgos.boardspace.net/study/13/

- Don
 

On Thu, 2008-08-28 at 10:10 -0400, Don Dailey wrote:
> On Thu, 2008-08-28 at 09:38 +0200, Rémi Coulom wrote:
> > Don Dailey wrote:
> > > I don't really believe the ELO model is "very wrong."   I only believe
> > > it is a mathematical model that is "somewhat" flawed for chess and
> > > presumable also for other games.   Do you have an alternative that might
> > > be more accurate?   
> > >
> > >
> > > - Don
> > 
> > I don't have very precise data about computer vs human. But in computer 
> > vs computer, I believe some of the data of your scalability study 
> > clearly demonstrated that strength of MoGo against Leela scaled 
> > differently from the strength of MoGo against GNU. But my memory of that 
> > data may not be very good.
> 
> Yes, I think you are remembering the FatMan vs. Mogo study where FatMan
> didn't scale nearly as well with time.  
> 
> The idea that I'm resistant to is NOT that some programs scale better
> than others,  I take that as a given.   
> 
> In the graph, it was very clear that Mogo scaled better at higher
> levels,  I don't care about that.  What I am asserting is that the 2600
> version of FatMan in the study is going to play about 2600 ELO strength
> against every opponent, not just Mogo.  And the stronger Mogo version
> will on average tend to be stronger against all other opponents.  If you
> could somehow magically place a bunch of humans of widely varying
> strengths in this study, as a kind of control,  it would have a
> relatively minor effect on the 2 respective lines.  
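
The one-dimensional assumption being asserted here is exactly what the standard logistic Elo formula encodes - a minimal sketch (function name illustrative), in which the opponent enters the prediction only through a single number:

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A vs. B under the logistic Elo model:
    1 / (1 + 10^((Rb - Ra) / 400)).  The opponent's identity
    enters only through the single number rating_b."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# A 2600 player is predicted to score about 0.76 against *any* 2400
# opponent - Mogo, GNU Go, or a human - which is the claim in question.
print(round(elo_expected_score(2600.0, 2400.0), 2))
```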
> 
> I guess my assertion is that Occam's Razor should be applied here until
> we have really good reason to believe differently.   If the study shows
> 2600 ELO we do not need to explain it away by some more complicated
> theory.    If my ego were hurt by the fact that Mogo scales better, I
> could easily construct a theory that explained it away.  This is what we
> tend to do when we don't want to believe something.    That's what I
> think is being done with the argument that improvement against computers
> doesn't translate to improvement against humans.   Sort of a false
> modesty - when I once beat a much stronger player, I made excuses for
> him; my brain was not ready to fully accept the win.       
> 
> I am probably older than many here (I'm 52) and I have some historical
> perspective to look back on with computer chess.  Unfortunately I don't
> have anything in print to back this up going back 30 years,  but I know
> the general feeling many had was that computer chess skill doesn't
> scale to play against humans.  One of the earliest scalability studies
> showed pretty remarkable improvement at increasing depths of search,
> and I do remember this being explained away - a remark to the effect
> that if this were to be believed, then in a few years we would have a
> computer grandmaster, and it was laughed off, so to speak.
> Guess what happened?   
> 
> I would like to put this to rest (and be proved correct) but
> unfortunately I don't have any proof of my assertions either.  It comes
> down to an opinion.   I'm willing to be proved wrong but we all know how
> difficult it is to get reliable evidence from human played games.  It
> really would require thousands if not millions of fairly played human
> vs. computer games at a huge variety of controlled levels and conditions
> to make sense of this.   The scalability study I did involved levels
> that cannot practically be played against humans in quantity.    Making
> a fair study against humans is really messy, and the fact that humans
> adapt to specific opponents better than computers do will tend to
> suppress the ratings of the machines artificially.   Humans also try
> harder against stronger opponents, which also skews the results.  
> 
> Something interesting happened in computer chess about 20 years ago,
> give or take.  There was a sudden explosion of interest in computer
> chess when stand-alone chess computers started getting strong enough and
> popular enough that a lot of chess players would purchase them.  They
> started getting official ratings and playing in tournaments.   But
> people very quickly learned how to play against them (someone even wrote
> a book about how to beat computers), and it was as if the progress of the
> computers stalled.   We "knew" that computers were continuing to improve
> but it wasn't "translating" against humans.  Again, a few were
> postulating that they had reached a plateau and that from now on you
> would only see very minor progress.    There was something like a 200
> ELO barrier that computers had to overcome to make up for the fact that
> most chess players no longer feared the computers and knew how to play
> them.   But it was a local phenomenon.   It was a temporary glitch.
> 
> There have always been certain players who specialize in beating
> computers and a style developed where you could get into a closed
> position and play a slow-developing sneak attack against the computer's
> king side.   This style worked against most of the weaknesses that
> computers had.   It created some serious intransitivities such that
> sometimes a low rated player would be unusually good at beating higher
> rated computers.   As far as I know there was no general solution that
> fixed this,  computers simply kept improving to the point where their
> strengths dominated the game.  And of course over time as computers
> search deeper and deeper this weakness also gradually goes away.   You
> almost always have to make some compromise to get the kind of game you
> want (like in GO high handicap games where KIM said you HAVE to
> overplay), and thus this intransitivity was a local phenomenon; it
> applied mostly to computers less than 2400 ELO in strength, for
> instance.    
> 
> This is why I currently believe these things are temporary and local.
> You cannot continue to get better without discarding some of your
> weaknesses.  And of course your strengths can be used to neutralize your
> weaknesses, like the tennis player whose forehand attack is so strong
> that the opponent cannot get to his weak backhand.    Take away
> his strengths, and his weaknesses start to become big factors.     
> 
> Rémi, I know that you are a bit of an expert on rating systems and I
> make heavy use of your bayeselo program - a beautiful tool.  I wish you
> would produce a command line version that didn't require interactivity.
> Anyway,  have you ever considered a rating system that could somehow
> measure strength in more than 1 dimension?   I suppose a huge problem
> would be getting statistically reasonable samples,  but it would be
> pretty cool if you could detect the rock, paper, scissors effect and
> given enough data be able to predict performance against particular
> styles of players.  That would require a system that used more than just
> a single value to specify playing ability.    It might be that if you
> assumed just 2 or 3 dimensions instead of 1, it could be a major
> improvement.   Perhaps most players make use of 2 or 3 major cognitive
> abilities to play go?   I know in chess I encountered superior players
> who could not calculate as well as I could, but they were clearly better
> players than I.   And there were others who were just the opposite: they
> seemed to know nothing about chess but were so tactically tenacious that
> they were difficult to beat. 
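
The multi-dimensional rating idea above can be sketched as a hypothetical extension of Elo (all names, dimensions, and constants here are illustrative, not an existing system): give each player a 2-D "style" vector in addition to a rating, and add an antisymmetric cross term to the win probability, which lets the model represent rock-paper-scissors cycles that a single rating cannot:

```python
import math

def win_prob(rating_a, style_a, rating_b, style_b, scale=400.0):
    """Logistic win probability from rating difference plus a style skew.
    The skew is the 2-D cross product, which is antisymmetric: A's edge
    over B is exactly B's deficit against A, so probabilities still sum
    to 1 for a pair - but cycles among three players become possible."""
    skew = style_a[0] * style_b[1] - style_a[1] * style_b[0]
    diff = (rating_a - rating_b) + skew
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))

def style(angle_deg, strength=10.0):
    """Place a player's style on a circle; angle encodes the style itself,
    strength how strongly style interactions matter for this player."""
    a = math.radians(angle_deg)
    return (strength * math.cos(a), strength * math.sin(a))

# Three equally rated players whose styles sit 120 degrees apart:
# each is favored (> 0.5) against the next, in a cycle, even though
# all three would earn the same single-number rating.
a, b, c = style(0), style(120), style(240)
print(win_prob(2400, a, 2400, b))  # a favored over b
print(win_prob(2400, b, 2400, c))  # b favored over c
print(win_prob(2400, c, 2400, a))  # c favored over a
```

Fitting something like this from game records would need far more data than one rating per player, which matches the sampling worry raised above.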
> 
> - Don
> 
> 
> 
> 
> 
> > 
> > Rémi
> > _______________________________________________
> > computer-go mailing list
> > computer-go@computer-go.org
> > http://www.computer-go.org/mailman/listinfo/computer-go/
> 

