Elo is a much better measure than dan rank because it has a
strict mathematical meaning.   Of course, all rating systems only
approximate playing ability, which cannot be measured precisely.

Intransitivity (between humans and computers) is probably a lot greater in
Go than in simpler games such as chess.   It's a factor even in chess, but
there it is easily dealt with.    If you have a program that beats another
97% of the time, it's about 600 Elo stronger.     But it may not do 600 Elo
better against humans.    We may be able to approximate how much of that
gain carries over.
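
For reference, the 97% figure can be checked with a few lines of Python. This is only a sketch that assumes the standard logistic Elo model (400 points per factor of 10 in odds); the function names are made up for illustration, and nothing here models how the result transfers to human opponents.

```python
import math

def elo_diff(win_rate: float) -> float:
    """Elo difference implied by a win rate under the logistic Elo model."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))

def expected_score(diff: float) -> float:
    """Inverse mapping: expected win rate for a given Elo advantage."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

if __name__ == "__main__":
    # A 97% win rate in program-vs-program play: about 604 Elo.
    print(round(elo_diff(0.97)))
```

The mapping only describes results inside the pool where the games were played; whether the same gap shows up against humans is exactly the open question.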

In computer chess, Larry has a rule-of-thumb formula for converting such
computer-vs-computer rating differences to human play.    Maybe something
like it would work well in computer Go.     I will ask him about it.

If we do a study,  it would make sense to have at least 3 or 4 different
programs involved that are believed to be highly scalable and run them all
at many different levels.

I could run such a study if I were given ssh access to enough machines and
instructions from the owners about how many cores I can use.    The programs
should have settings that allow me to set the playouts per move so that we
don't have to worry about which computers are faster or slower than others.
Basically, I would run the test by remote shell into the respective
machines.  The match controller (twogmp or some other such tool) would run
on my computer, and each program would be launched via a remote shell
command.

But first we would have to decide which programs to run.   The more programs
we choose, the more difficult it is to get reasonable samples.   The bad news
is that we need a lot of games per program/level.   I think that 5,000 games
for a given player gives us an error margin of something around plus or
minus 10 Elo.
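
A back-of-the-envelope check of that estimate, as a sketch under stated assumptions: games are independent, there are no draws, the opponents are roughly evenly matched (score near 50%), and the margin is a 95% normal-approximation interval converted to Elo by linearising the logistic Elo curve. The function name and defaults are made up for illustration.

```python
import math

def elo_margin(games: int, score: float = 0.5, z: float = 1.96) -> float:
    """Approximate half-width, in Elo, of a confidence interval on a
    rating difference measured from `games` independent win/loss games."""
    if games <= 0 or not 0.0 < score < 1.0:
        raise ValueError("need games > 0 and 0 < score < 1")
    # Standard error of the observed score (binomial proportion).
    se_score = math.sqrt(score * (1.0 - score) / games)
    # Slope of 400*log10(p/(1-p)) with respect to p, evaluated at `score`.
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return z * slope * se_score

if __name__ == "__main__":
    for n in (500, 5000, 20000):
        print(n, round(elo_margin(n), 1))
```

Under these assumptions 5,000 games come out a little under 10 Elo, and since the margin shrinks with the square root of the number of games, halving it takes four times as many games. Lopsided pairings (scores far from 50%) give wider margins in Elo terms.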

Don







On Fri, Jun 17, 2011 at 6:58 AM, Jacques Basaldúa <[email protected]> wrote:

> Stefan Kaitschick wrote:
>
> > This is really a quirk of the go ranking system, which
> > defines strength as the ability to give handicap stones.
> > If strength were defined as being able to win a certain
> > percentage of even games, things would be different.
>
> That is a very important issue. Assuming
>
> 1 stone = xx Elo points
>
> is wrong and becomes even worse as the level rises.
> Another issue we may not have taken seriously enough
> is intransitivity.  We say it is not very important,
> and that may be true for humans, but nothing like a
> computer to force "non important" things to become
> most important.
>
> When I was experimenting with learning playout weights
> using GAs (an approach I have since abandoned; I have
> something much better in progress) I found it easy to
> make a chain where:
>
> B is 50 Elo points stronger than A
> C is 50 Elo points stronger than B, D than C, E than D
>
> And when you confront E vs A and expect the difference
> to be 200-ish it is only 30 points. And all this is
> done with appropriate Agresti-Coull confidence
> intervals and significance tests for the difference, so
> it is not a conclusion based on a wrong setup.
>
> I bring this here because it is hardly conceivable
> that a program does not scale in self play, something
> must be broken. But I can see Hideki's point that
> this scaling may take the program nowhere when it
> is about advancing quantum leaps. The infinite
> convergence of the tree is a beautiful argument but
> of no practical application. The tree simply does not
> grow in directions the playouts discourage because
> they don't understand the precise sequences.
>
> If we do another study, one big difficulty will be
> finding strong and different (= non MCTS) opponents
> to make the pool a little less homogeneous. Else, the
> study may be too much influenced by Elo rating
> limitations.
>
>
> Jacques.
>
>
>
>
>
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>
