Is there statistical proof that this is a major issue? I have not reviewed the reference to the forum post but I would like to say this:
If you expect something to happen, you will notice it when it does, even if what you think is happening really isn't. I'm not saying it didn't or doesn't happen, but caution is in order.

To be sure that this really is what you think, you must play a huge number of games. You must also look at the head-to-head results between the 2 versions in question. To illustrate the magnitude of the problem, there is about a 50/50 chance you will see this phenomenon to some degree even if it doesn't exist. Even to get the error bars under 10 Elo you have to play something like 3000 games between the 2 versions in question, and then a similar number of games between other programs and BOTH versions.

I have no doubt this is somewhat of an issue, even between MCTS programs in general, but I doubt it's major (I could be wrong - depending on how you define major.) To quantify it you must play tens of thousands of games in order to nail this down to within 10 or 20 Elo. You could get by on fewer games if the problem is bigger, of course.

If this really is a problem I can minimize its impact - at the sacrifice of diversity. In other words, if I increase the diversity with server adjustments, then if you have a strong program, you will have to play weaker opponents more often. This will also make the ratings less stable, which could cause people to make false observations (and I'm not claiming this is a false observation, but is there proof that it's major?)

Anyway, I provide a digest of all results (and the SGF games are available) so that anyone who wishes can scrutinize the results and show that it's statistically improbable (which it might be.)

- Don

On Tue, Jun 23, 2009 at 10:30 AM, Hiroshi Yamashita <y...@bd.mbn.or.jp> wrote:

>> Can you explain this?  I don't understand what you are saying.
>
> Once I ran both 1 core and 2 cores Aya on 19x19 CGOS, 2 cores Aya
> got high rating. But without 1 core Aya, 2 cores Aya could not get
> such a high rating.
>
> Remi also reported same phenomenon.
>
> [computer-go] CGOS Deflation or Self-Play delusion?
> http://computer-go.org/pipermail/computer-go/2008-February/013995.html
>
> Regards,
> Hiroshi Yamashita
>
> ----- Original Message -----
> From: "Don Dailey" <dailey....@gmail.com>
> To: "computer-go" <computer-go@computer-go.org>
> Sent: Tuesday, June 23, 2009 11:12 PM
> Subject: Re: [computer-go] CGOS 19x19 anchor
>
>> On Tue, Jun 23, 2009 at 10:10 AM, Hiroshi Yamashita <y...@bd.mbn.or.jp> wrote:
>>
>>>> I restarted the 19x19 server.
>>>
>>> Thank you. I started my bot.
>>>
>>>> I'm thinking about making some specified version of fuego
>>>
>>> I think using Fuego for anchor is good idea.
>>> One problem is maybe latest Fuego will be overrated from
>>> weak Fuego anchor.
>>
>> Can you explain this?  I don't understand what you are saying.
>>
>> - Don
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
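
For anyone who wants to check the rough size of the error bars discussed in the reply above, here is a minimal sketch. It is not how CGOS computes ratings. It assumes independent games with no draws, uses the standard logistic Elo formula, and applies a normal approximation (delta method) to the win rate. The function names and the default 95% level (z = 1.96) are my own choices for illustration.

```python
import math

# Elo points per unit of win probability, before dividing by p * (1 - p).
# Comes from differentiating d(p) = -400 * log10(1/p - 1).
ELO_SCALE = 400.0 / math.log(10.0)


def elo_from_score(p):
    """Elo difference implied by a win rate p, with 0 < p < 1."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def elo_margin(n_games, p=0.5, z=1.96):
    """Approximate +/- margin in Elo after n_games at true win rate p.

    Assumes independent games, no draws, and a normal approximation.
    """
    std_err = math.sqrt(p * (1.0 - p) / n_games)   # std. error of the win rate
    return z * ELO_SCALE * std_err / (p * (1.0 - p))


def games_needed(target_elo, p=0.5, z=1.96):
    """Approximate number of games for the margin to reach target_elo."""
    n = (z * ELO_SCALE / target_elo) ** 2 / (p * (1.0 - p))
    return math.ceil(n)


if __name__ == "__main__":
    for n in (100, 1000, 3000, 10000):
        print(f"{n:6d} games: +/- {elo_margin(n):.1f} Elo")
    print("games for +/- 10 Elo:", games_needed(10.0))
```

Under these assumptions, a few thousand games between two evenly matched versions leaves a margin on the order of 10 Elo, which is in line with the figure quoted above. The margin shrinks only with the square root of the number of games, so halving it takes about four times as many games.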