[Bug-gnubg] A new variance reduction technique

bob koca Sun, 29 Mar 2009 12:50:45 -0700

 

  This was also posted to the forums at bgonline.org,
 
Bob Koca
 
---
 
I have a method that I think will make backgammon rollouts more efficient in 
that the std. errors will be smaller for the same number of games rolled out.  
I don’t have the skills though to actually program it. Let me first give the 
basic idea of the theory with an example. To read more about this the  key 
words would be 2 sample t-tests and matched pairs.  Nearly all college level 
introductory stats textbooks would have a chapter on this. 
 
  Suppose you want to measure the difference  Mh – Fh where Mh is the average 
height of  married males and Fh is the average height of married females.  One 
way to do this is to take a random sample of married males and find the sample 
average and deviation and also find a random sample of females and find the 
sample average and deviation.  If the sample sizes are large enough and/or the 
population itself is close to normal then the sample averages each have an 
approximately normal distribution (with the true population average as its mean 
and deviation determined by its sample deviation). The difference is also then 
normal. This means our best guess for the true population difference is just 
the difference in the sample means.  The two samples were taken independently 
and the variance of the sum or difference of two random variables is the sum of 
the two individual variances. If the s dev are equal this gives a s dev of the 
difference equal to SQRT(2) times the STD dev of a single population.  This is 
essentially how rollouts are now done to compare play A to play B. 
 
  A second way to estimate the difference in heights it is to use matched 
pairs. Instead of choosing males randomly and also females randomly choose 
married couples randomly. For each couple find the difference (male height – 
female height).  This gives us a single list of numbers. Its average will be 
our best guess for the true population difference and the deviation of this 
list can be used to find the std error for our guess.  The std. error will be 
lower for this case as it takes advantage of the positive correlation between 
husband and wife heights.  As a near extreme case imagine that a husband is 
nearly always very close to being 3 inches taller than his wife. Use of the 
matched pairs technique shows this fact quickly. The greater the positive 
correlation the greater benefit is obtained. I believe that the matched pairs 
technique can be used in rollouts and will give a smaller standard error for 
the same number of trials. 
 
    Suppose one wants to do a rollout to compare play A with play B. What would 
be the meaning of having a positive correlation here? It would mean that if 
game n of say 1296 trials has a certain amount of luck for play A it would have 
about that same amount of luck for game n of the rollout of  play B as well. 
Using the same dice stream is a better than nothing but somewhat feeble way to 
do this as the games could quickly diverge. 
  
      A better way is to ensure that when play A gets its best roll that play B 
also does and vice versa. The evaluation feature can make this possible. Choose 
a roll for one of the play’s rollouts at random. But then instead of using that 
same roll or another randomly chosen roll for the other play’s rollout make it 
so that the ranks are the same. So suppose that the rollout of play A just got 
its 5th best (out of 36) roll. Determine the 5th best roll for the rollout of 
play B and use that roll. If there is a tie break it randomly. At the end of 
each trial look at result of A – result of B.  This gives a single list of 
numbers. The average value gives your estimate for equity of A – equity of B 
and the s.dev of the list gives the std. error. Conventional variance reduction 
techniques such as Dahl’s algorithm, and using each each die roll once could 
still be used. 
 
     Pseudocode (or perhaps I should call it pseudo-pseudo code) on how to do 
this is at the end. I am not a programmer so I hope my thoughts are 
interpretable. 
   
 How to test the value of this once it is has been coded up? Well one could 
just look at the std error of the estimate of the difference between play A and 
B. But I think something like the following would be more convincing:  Take 
several instances of play vs play decisions. RO (conventionally) with a huge 
number of trials so that the true difference in the win chances are known very 
accurately. Now using the same bot RO with just 36 trials the play A vs play B 
choice. Look at the discrepancy between the RO result and the true difference 
we have already found. Do this say 100 times. The std. deviation of those 
errors gives a measure of how accurate the 36 game ROs  are. Now repeat using 
the same bot but with the above technique.  Compare the std deviation of the 
error from these rollouts against those obtained from the conventional 
rollouts. 
  
  PSEUDO CODE to estimate the difference in win% between play A and play B. 
DEFNS: 
A Strand is a rollout starting with a certain play, here either play A or play 
B. 
 A Trial is a certain game being rolled out. For example a rollout might have 
1296                  trials for each of two strands. 
 
PROCEDURE: ForEachRoll:
 
1)      Generate a random number  (COMMENT could be done in conjunction with 
the rotation idea for rollouts that are powers of 36)
If the current trial is unresolved for both strands:  Then
2)      Using 2 ply analysis find the rank out of the 36 possible dice rolls 
for that roll for the rollout in which play A was made initially.  (COMMENT: 
Note that this is not computationally expensive since is needed anyways for 
Dahl V.R.)
3)      Using two ply analysis find the roll which has the same rank for the 
strand in which  play B was made initially.  If there is a tie then break it 
randomly. 
4)      Using 2 ply analysis find and make the best plays for each. 
5)      Use Dahl VR and adjust the running count for each strand.
 
If only one strand is unresolved then, 
2)      Choose a die roll randomly.
3)      Play the move for that strand using 2-ply
4)      Use Dahl VR and adjust the running count for each strand. 
 
PROCEDURE ForEachTrial
 
Run ForEachRoll 
If Both strands are not resolved Then Run ForEachRoll
If both strands are resolved then record the following in 1XN arrays where N = 
number of trials to run. 
 
1)      A = game result for  strand A + Dahl VR adjustment
2)      B = game result for strand B + Dahl VR adjustment
3)      Diff = A - B
 
   
 
PROGRAM:  Do full RO
Inputs:  N = number of trials. 
 
Set Counter = 1
Until Counter = N+1 do:                    
  Run Procedure ForEachTrial
  Increase N by 1
Use the array A to give estimate of win%  and std error for how often play A 
wins.
Use the array B to give estimate of win% and std. error for how often play B 
wins.
Use the array Diff to give estimate of difference in win% and its std error


 

_________________________________________________________________
Express your personality in color! Preview and select themes for Hotmail®.
http://www.windowslive-hotmail.com/LearnMore/personalize.aspx?ocid=TXT_MSGTX_WL_HM_express_032009#colortheme

_______________________________________________
Bug-gnubg mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnubg

[Bug-gnubg] A new variance reduction technique

Reply via email to