Re: [computer-go] Scalability study suggestion
I agree with everything except for the seeding. It takes very few games (especially with the distribution you suggest) to get somewhat near the right spot. With roughly 500 games being played per player, the initial time spent finding the right place isn't unreasonable. Keep in mind that if you totally outmatch someone, the time they spend thinking will likely be peanuts compared to yours, because we're looking at doublings, so those initial "where do I roughly belong" games will go relatively quickly. Another way of thinking about it: if you use a uniform prior, convergence to the actual value isn't much slower than if you use a (potentially biased) prior. fat14, mogos 17 and 18, big16, and potentially big18 might all have been seeded too high this way, and they're the ones we want the most accurate numbers on at this point.

s.

----- Original Message -----
From: Chuck Paulson [EMAIL PROTECTED]
To: computer-go <computer-go@computer-go.org>
Sent: Saturday, February 2, 2008 1:02:15 PM
Subject: [computer-go] Scalability study suggestion

I have been following the scaling study closely. Thanks to everyone for gathering such interesting data, and especially to Don and the people donating computers. I have two suggestions that could help increase the amount of information gathered from all the CPU hours being used to play these games. The first is about changing the algorithm used to pair players together, and the second is about "seeding" players to an appropriate starting position.

Suppose we have two players A and B who each have about a 50% chance of winning when they play each other. We expect to gain about -(0.5*log2(0.5) + 0.5*log2(0.5)) = 1 bit of information from every game they play. On the other hand, if player A has a 95% chance of beating player B, we get about -(0.95*log2(0.95) + 0.05*log2(0.05)) = 0.29 bits of information from each game. Therefore, the more often nearly equal players meet, the more information is gained.

Now suppose we have a list of players ranked 0 to N, where the lower the rank, the stronger the player. Assume that player n takes twice as long to play a game as player n+1 for any n, that the Elo difference between player n and player n+1 is 80 points, and that the cost of a game for player 0 is 1.0 time units. These assumptions seem reasonable for the upper levels of the study if we only consider the Mogo programs up to Mogo_16 (the ones that have about 500 games). Using these assumptions we can build the following table.

Rank  Cost    WinProb  Bits    Bits/Cost
   0  2.0000  0.5000   1.0000  0.5000
   1  1.5000  0.3869   0.9627  0.6418
   2  1.2500  0.2847   0.8618  0.6895
   3  1.1250  0.2008   0.7234  0.6431
   4  1.0625  0.1368   0.5758  0.5419
   5  1.0313  0.0909   0.4395  0.4262
   6  1.0156  0.0594   0.3249  0.3199
   7  1.0078  0.0383   0.2344  0.2326
   8  1.0039  0.0245   0.1660  0.1654
   9  1.0020  0.0156   0.1160  0.1157
  10  1.0010  0.0099   0.0801  0.0801

For example, if player 0 plays player 1, player 0 takes 1 unit of time and player 1 takes 0.5 (half of player 0's time), for a total cost of 1.5 units.
The Elo difference is 80, so using the formula P = 1 / (1 + 10^(80/400)) = 0.3869 we see that player 1 has about a 39% chance of winning. The bits of information we expect to get are -(0.3869*log2(0.3869) + (1 - 0.3869)*log2(1 - 0.3869)) = 0.9627, and the Bits/Cost is 0.9627/1.5 = 0.6418. The table shows the results of player 0 playing all players ranked 0 to 10 ranks below it. If player 0 plays himself, the cost is 2.0 and the probability of a win is 50%, giving 1 bit of information and a Bits/Cost of 0.5.

From the table, to maximize the information gained per unit of time, the best games are played between players that are 2 ranks apart. That has to be balanced against having a wide variety of games to get robust rankings. I believe the current method is to pick 3 random opponents and play the one whose rating is closest.
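[Editor's note: a quick Python sketch, not from Chuck's post, that reproduces the table above under its stated assumptions: an 80-Elo gap per rank, each rank taking half the time of the rank above it, the cost of a game being the sum of both players' time, and a game's information content measured as the entropy of its outcome.]

    import math

    ELO_STEP = 80.0  # assumed Elo gap between adjacent ranks

    def win_prob(elo_deficit):
        # Chance that the weaker player wins, given its Elo deficit.
        return 1.0 / (1.0 + 10.0 ** (elo_deficit / 400.0))

    def bits(p):
        # Shannon entropy of a game whose outcome has probability p.
        return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

    print("Rank  Cost    WinProb  Bits    Bits/Cost")
    for rank in range(11):
        cost = 1.0 + 2.0 ** -rank      # player 0 costs 1.0; rank k costs 2^-k
        p = win_prob(ELO_STEP * rank)  # rank k's chance against player 0
        h = bits(p)
        print(f"{rank:4d}  {cost:6.4f}  {p:7.4f}  {h:6.4f}  {h / cost:9.4f}")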
Re: [computer-go] Scalability study suggestion
Hi Chuck,

Thank you for your interesting suggestions. I have previously considered a system where the distribution is based on how many contestants there are. For instance, if there are hundreds of players you would want to generate best-of-5 or 6 or more, but if there were only 3 or 4 you might want best-of-2. But it's not just the number of players either: if you have 1000 players but they are all within 50 ELO of each other, you could almost choose your opponent randomly.

There is definitely a seeding problem, because each instance of the tester is ignorant of the players' strengths. The autotester eventually discovers their approximate ELO level and then it's OK, but only after a number of games, many of which are ugly mismatches. However, when I add a player to the autotester I do try to guess at a reasonable rating in order to prevent this. The autotester is not too aggressive about rating players - they are allowed to change fairly slowly, so that an initial seed will stick.

I am definitely interested in a better pairing algorithm, and I have plans to improve the autotester and distribute it as freeware when it's easier to use (right now you must know SQL to really get good use out of it). What I actually have in mind is some kind of quick Monte Carlo simulation at scheduling time, to make a pairing decision that fits whatever constraints we think are useful. I won't have a chance to make any serious changes that require a lot of testing in the near future, but when we do the next study I can at least raise the N in best-of-N and decrease the number of mismatches.

Something I was big on a year or two ago was to schedule as if the players were playing a series of Swiss tournaments. This is a very natural system which mirrors how programs and people get paired in the real world. However, I ran some simulations and the distribution is horrible. There are pockets of players you would rarely play and other pockets you are very likely to play, and the frequency doesn't closely match the strength of the players - in other words, you may be much more likely to face a player 600 ELO distant than one 200 ELO distant.

I ran this simulation 3 different ways. One was based on correct pairings that take the ratings of the players into account (top half plays bottom half in each scoring section of each round). Another was to assume nothing about their actual strengths and pair them as if they were all playing in their first tournament ever. A third was to give them random ratings for pairing purposes, but pair them the correct Swiss way, as if those ratings were their actual ratings. All of these systems generated uneven pockets.

- Don
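[Editor's note: a minimal sketch of the "best of N" pairing rule as the thread describes it - sample N candidate opponents at random, then play the one closest in rating. The player names and ratings below are made up, and the real autotester may implement this differently.]

    import random

    def pick_opponent(player_elo, pool, n=3):
        # Best-of-N pairing: sample n candidates at random, then keep the
        # one whose rating is closest to ours. Raising n tightens pairings
        # (fewer mismatches) at the cost of less variety.
        sample = random.sample(pool, min(n, len(pool)))
        return min(sample, key=lambda entry: abs(entry[1] - player_elo))

    # Hypothetical pool of (name, ELO) entries, excluding the player itself.
    pool = [("fat14", 2400), ("mogo_16", 2350), ("mogo_10", 2000), ("mogo_4", 1520)]
    print(pick_opponent(2000, pool, n=3))  # usually lands near 2000 ELO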
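[Editor's note: and a rough simulation in the spirit of Don's first Swiss variant, under assumed parameters (16 players 50 Elo apart, 4-round events, top half of each score group paired against its bottom half). Counting how often each pair meets over many events exposes the uneven "pockets" he describes.]

    import random
    from collections import Counter

    def win_prob(elo_a, elo_b):
        # Chance that player a beats player b under the Elo model.
        return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))

    def swiss_event(elos, rounds, meetings):
        # One Swiss event: group players by score, sort each group by
        # rating, and pair the top half of the group against the bottom half.
        scores = [0] * len(elos)
        for _ in range(rounds):
            groups = {}
            for i in range(len(elos)):
                groups.setdefault(scores[i], []).append(i)
            for group in groups.values():
                group.sort(key=lambda i: elos[i], reverse=True)
                half = len(group) // 2  # an odd group gives one player a bye
                for a, b in zip(group[:half], group[half:2 * half]):
                    meetings[frozenset((a, b))] += 1
                    if random.random() < win_prob(elos[a], elos[b]):
                        scores[a] += 1
                    else:
                        scores[b] += 1

    random.seed(1)
    elos = [1600 + 50 * i for i in range(16)]  # hypothetical field, 50 Elo apart
    meetings = Counter()
    for _ in range(2000):
        swiss_event(elos, rounds=4, meetings=meetings)

    # Meeting counts for the weakest player; uneven pockets show up here.
    for j in range(1, 16):
        print(f"player 0 vs player {j:2d}: {meetings[frozenset((0, j))]:5d} games")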