Re: [computer-go] Scalability study suggestion

2008-02-02 Thread steve uurtamo
i agree with everything except for the seeding.  it takes very few games
(especially with the distribution you suggest) to get somewhat near the
right spot.  with 500-ish games being played per player, the initial time
to get into the right place isn't unreasonable.

keep in mind that if you totally outmatch someone, the time they spend
thinking will likely be peanuts compared to yours, because we're looking
at doublings (an opponent five doublings down adds only 1/32 of your time
to the game).  so those initial "where do i roughly belong" games will go
relatively quickly.

another way of thinking about this is that if you use a uniform prior,
the convergence to the actual value isn't much slower than if you
use a (potentially biased) prior.  fat14, mogos 17, 18, big16 and potentially
big18 might have all been seeded too high this way, and they're the
ones we want the most accurate numbers on at this point.
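
here's a toy illustration of that point in python -- a plain incremental
elo update (k=16) against a made-up pool of rated opponents.  all the
numbers are invented for the sketch, nothing here comes from the study:

    import random

    def expected_score(r_a, r_b):
        # standard elo expectation for a player rated r_a against r_b
        return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

    random.seed(1)
    TRUE_ELO = 2200.0                              # newcomer's real strength (made up)
    pool = [2000.0 + 50.0 * i for i in range(9)]   # fixed, already-rated opponents

    for start in (1800.0, 2150.0):                 # uniform-ish prior vs. a good seed
        r, k = start, 16.0
        for _ in range(100):
            opp = random.choice(pool)
            p_win = expected_score(TRUE_ELO, opp)  # newcomer's true win probability
            score = 1.0 if random.random() < p_win else 0.0
            r += k * (score - expected_score(r, opp))
        print("seeded at %4.0f -> rating after 100 games: %4.0f" % (start, r))

both starting points land in the same neighborhood after 100-ish games,
which is the sense in which the uniform prior doesn't cost much.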

s.

----- Original Message -----
From: Chuck Paulson [EMAIL PROTECTED]
To: computer-go computer-go@computer-go.org
Sent: Saturday, February 2, 2008 1:02:15 PM
Subject: [computer-go] Scalability study suggestion

I have been following the scaling study closely. Thanks to everyone for
gathering such interesting data, and especially to Don and the people
donating computers.

I have two suggestions that could increase the amount of information
gathered from all the CPU hours being used to play these games. The first
suggestion is about changing the algorithm used to pair players together,
and the second is about “seeding” players at an appropriate starting
position.

Suppose we have two players A and B who each have about a 50% chance of
winning when they play each other. We expect to gain about -(0.5*log2(0.5)
+ 0.5*log2(0.5)) = 1 bit of information from every game they play. On the
other hand, if player A has a 95% chance of beating player B, we get only
about -(0.95*log2(0.95) + 0.05*log2(0.05)) = 0.29 bits of information per
game. Therefore, the more often you are playing nearly equal players, the
more information is gained per game.
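
As a quick check of these numbers, here is a small Python sketch of the
calculation (the helper name is mine, not from any existing tool):

    import math

    def game_info_bits(p_win):
        # Expected information (binary entropy) of one game result, in bits,
        # where p_win is the probability that a given side wins.
        q = 1.0 - p_win
        return -(p_win * math.log2(p_win) + q * math.log2(q))

    print(game_info_bits(0.50))  # 1.0000 bit  (evenly matched)
    print(game_info_bits(0.95))  # ~0.286 bits (one-sided match)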

Now suppose we have a list of players ranked 0 to N, where the lower
the ranking, the stronger the player. Assume that player n takes twice as long
to play a game as player n+1 for any n. Also assume that the Elo difference
between player n and player n+1 is 80 points, and that the cost of player 0
playing a game is 1.0 time units. These assumptions seem reasonable for the
upper levels of the study if we only consider the Mogo programs up to Mogo_16
(the ones that have about 500 games). Using these assumptions we can build the
following table.

Rank  Cost    WinProb  Bits    Bits/Cost
0     2.0000  0.5000   1.0000  0.5000
1     1.5000  0.3869   0.9627  0.6418
2     1.2500  0.2847   0.8618  0.6895
3     1.1250  0.2008   0.7234  0.6431
4     1.0625  0.1368   0.5758  0.5419
5     1.0313  0.0909   0.4395  0.4262
6     1.0156  0.0594   0.3249  0.3199
7     1.0078  0.0383   0.2344  0.2326
8     1.0039  0.0245   0.1660  0.1654
9     1.0020  0.0156   0.1160  0.1157
10    1.0010  0.0099   0.0801  0.0801

For example, if player 0 plays player 1, the total cost is 1.5 units:
player 0 takes 1 unit of time and player 1 takes 0.5 (half of player 0's
time). The Elo difference is 80, so using the formula P = 1 / (1 +
10^(80/400)) = 0.3869 we see that player 1 has about a 39% chance of
winning. The bits of information we expect to get are
-(0.3869*log2(0.3869) + (1 - 0.3869)*log2(1 - 0.3869)) = 0.9627, and the
Bits/Cost is 0.9627/1.5 = 0.6418. The table shows the results of player 0
playing all players ranked 0 to 10 ranks below him. If player 0 plays
himself, the cost is 2.0 and the probability of a win is 50%, giving 1 bit
of information and a Bits/Cost of 0.5.
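
Here is a short Python sketch that reproduces the table from the stated
assumptions (80 Elo per rank, time halving per rank; the helper names are
illustrative):

    import math

    def win_prob(elo_deficit):
        # Elo logistic: win probability for a player elo_deficit points weaker.
        return 1.0 / (1.0 + 10.0 ** (elo_deficit / 400.0))

    def info_bits(p):
        # Binary entropy of a game whose sides win with probability p and 1-p.
        q = 1.0 - p
        return -(p * math.log2(p) + q * math.log2(q))

    print("Rank  Cost    WinProb  Bits    Bits/Cost")
    for rank in range(11):
        cost = 1.0 + 0.5 ** rank    # player 0's time plus the weaker player's
        p = win_prob(80.0 * rank)   # weaker player's chance of winning
        bits = info_bits(p)
        print("%-6d%-8.4f%-9.4f%-8.4f%.4f" % (rank, cost, p, bits, bits / cost))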

From the table, to maximize the information gain per unit of time, the
best games are played between players that are 2 ranks apart. That has to be
balanced against having a wide variety of games to get robust rankings.

I believe the current method is to pick 3 random opponents 

Re: [computer-go] Scalability study suggestion

2008-02-02 Thread Don Dailey
Hi Chuck,

Thank you for your interesting suggestions.  I have previously
considered a system where the distribution is based on how many
contestants there are.  For instance, if there are hundreds of players
you would want to generate best of 5 or 6 or more, but if there were
only 3 or 4 you might want best of 2.
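
If I read the best-of-N scheme right (this is my guess at the mechanism,
not code from the autotester), it amounts to something like:

    import random

    def best_of_n(me, ratings, n=3):
        # Draw n random candidates and keep the one rated closest to me;
        # a larger n biases the schedule toward evenly matched games.
        candidates = random.sample([p for p in ratings if p != me], n)
        return min(candidates, key=lambda p: abs(ratings[p] - ratings[me]))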

But it's not just the number of players either: if you have 1000 players
but they are all within 50 ELO of each other, you could almost choose
your opponent randomly (by Chuck's formula, a game between players 50 ELO
apart still yields about 0.99 bits).

There is definitely a seed problem, because each instance of the tester
is ignorant of the players' strengths.  The autotester eventually
discovers their approximate ELO level and then it's ok, but only after a
number of games, many of which are ugly mismatches.

However, when I add a player to the autotester I do try to guess at a
reasonable rating in order to prevent this.  The autotester is not too
aggressive about rating players - they are allowed to change fairly
slowly, so that an initial seed will stick.

I am definitely interested in a better pairing algorithm, and I have
plans to improve the autotester and distribute it as freeware when it's
easier to use (right now you must know SQL to really get good use out of
it).  What I actually have in mind is some kind of quick Monte Carlo
simulation at scheduling time, in order to make a pairing decision that
fits whatever constraints we think are useful.
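
To make that concrete, here is a rough sketch of one such scheduling rule
-- sampling opponents in proportion to Chuck's bits-per-cost measure.
This is only an illustration of the idea, not the autotester's code, and
all of the names are hypothetical:

    import math, random

    def win_prob(elo_diff):
        # Elo logistic win probability for the weaker side.
        return 1.0 / (1.0 + 10.0 ** (abs(elo_diff) / 400.0))

    def info_bits(p):
        # Binary entropy: expected bits of information from one game.
        q = 1.0 - p
        return -(p * math.log2(p) + q * math.log2(q))

    def pick_opponent(me, ratings, costs):
        # Sample an opponent with probability proportional to expected
        # information per unit of combined game time.
        candidates = [p for p in ratings if p != me]
        weights = [info_bits(win_prob(ratings[me] - ratings[p]))
                   / (costs[me] + costs[p]) for p in candidates]
        return random.choices(candidates, weights=weights, k=1)[0]

Swapping in other constraints would then just be a change to the weights.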

I won't have a chance to make any serious changes that require a lot of
testing in the near future,  but when we do the next study,  I can at
least change the N in best of N to a higher number and decrease the
number of mismatches.

Something I was big on a year or two ago was to schedule as if the
players were playing a series of Swiss tournaments.  This is a very
natural system which mirrors how programs and people get paired in the
real world.  However, I ran some simulations and the distribution is
horrible.  There are pockets of players you would rarely play and other
pockets you are very likely to play, and the frequency doesn't closely
match the strength of the players - in other words, you may be much more
likely to face a player 600 ELO distant than one only 200 ELO distant.

I ran this simulation 3 different ways.  One way was based on correct
pairings, where you take the ratings of the players into account (top
half plays bottom half in each score group of each round).  Another way
was to assume nothing about their actual strengths and pair them as if
they were all playing in their first tournament ever.  A third way was
to give them random ratings for pairing purposes, but pair them the
correct Swiss way as if those random ratings were their actual ratings.
All of these systems generated uneven pockets.
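
For reference, a minimal sketch of that first pairing rule (score groups,
top half against bottom half), ignoring byes, floaters, repeat-avoidance,
and colors; the names are mine:

    from collections import defaultdict

    def swiss_round(players, scores, ratings):
        # Group players by current score, sort each group by rating,
        # then pair the top half of each group against its bottom half.
        groups = defaultdict(list)
        for p in players:
            groups[scores[p]].append(p)
        pairs = []
        for score in sorted(groups, reverse=True):
            grp = sorted(groups[score], key=lambda p: ratings[p], reverse=True)
            half = len(grp) // 2
            pairs.extend(zip(grp[:half], grp[half:2 * half]))
        return pairs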


- Don


