[email protected] wrote on 18/01/2009 23:05:53:
> Hi all,
>
> I was looking for ways to test the rollout code and came up with the following:
>
> a) Setup evaluation on ply 1, and turn off the pruning net.
> b) Setup rollouts to truncate after 1ply, 36 trials, quasi-random dice
> on, varredn off, 0ply cubeful play and cube(I believe that pruning
> isn't active at 0ply play, but it was turned off anyway to be safe).
> c) Start a new money session and roll 21.
> d) evaluate the position after the 24/23 13/11 split
> e) rollout the position after the 24/23 13/11 split
>
> Now if everything work as expected d) and e) should give the same
> result, and indeed they do:
>
> Position ID: 4HPkASjgc/ABMA
> Match ID: MIENAAAAAAAA
> Evaluator: Contact
>        Win      W(g)     W(bg)    L(g)     L(bg)    Equity    Cubeful
> 1 ply: 0.507852 0.137046 0.005140 0.134718 0.004246 +0.018926 +0.025745
>
> 0.507852 0.137046 0.005140 - 0.492148 0.134718 0.004246 CL
> +0.018926 CF +0.025745
> [0.007014 0.003629 0.000197 - 0.007014 0.003758 0.000291 CL
> 0.020000 CF 0.025748] 1r
>
> The same goes for 2-ply evaluation compared to a rollout truncated at
> 2ply and running for 1296 trials
>
> 2 ply: 0.493744 0.131802 0.004152 0.138802 0.005553 -0.020912 -0.024975
>
> 0.493744 0.131802 0.004152 - 0.506256 0.138802 0.005553 CL
> -0.020912 CF -0.024975
> [0.001493 0.000924 0.000058 - 0.001493 0.001045 0.000080 CL
> 0.004525 CF 0.005778] 1r
>
> However 3-ply eval compared to a rollout truncated at 3-ply running
> for 46656 trials show minor differences:
>
> 3 ply: 0.507612 0.134868 0.004909 0.135156 0.004396 +0.015449 +0.020794
>
> 0.507605 0.134899 0.004911 - 0.492395 0.135157 0.004397 CL
> +0.015466 CF +0.020540
> [0.000323 0.000210 0.000014 - 0.000323 0.000219 0.000015 CL
> 0.000980 CF 0.001298] 1r
>
> The effect can be greater with other examples, but this was the
> easiest one to report and reproduce.
>
> There is only one situation that I know of where one should be
> careful, and that is when the rolled out position is close to a
> double, and that isn't the case here.

Hmmm, not sure, maybe in a couple of ply it can become a double
(especially for some crazy opponent replies). It may be just a few
cases in the tree but ...

Have you tried the same thing cubeless? It only validates the
cubeless rollout code but ...

MaX.
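
P.S. For anyone who wants to check the quoted figures by hand, here is a
minimal Python sketch (it is not part of gnubg and calls no gnubg code;
the function names are made up for illustration). It assumes the usual
cubeless money equity formula, 2*Win - 1 + (W(g) - L(g)) + (W(bg) - L(bg)),
and that the trial counts 36, 1296 and 46656 were chosen as 36^n so that
the quasi-random dice can cover every dice sequence of the first n rolls
once. It also prints the 3-ply eval versus rollout gaps, cubeless and
cubeful, using only numbers copied from the output above.

```python
# Figures copied from the quoted gnubg output.
# Order: Win, W(g), W(bg), L(g), L(bg), reported cubeless equity.
RESULTS = {
    "1-ply eval":    (0.507852, 0.137046, 0.005140, 0.134718, 0.004246, +0.018926),
    "2-ply eval":    (0.493744, 0.131802, 0.004152, 0.138802, 0.005553, -0.020912),
    "3-ply eval":    (0.507612, 0.134868, 0.004909, 0.135156, 0.004396, +0.015449),
    "3-ply rollout": (0.507605, 0.134899, 0.004911, 0.135157, 0.004397, +0.015466),
}


def cubeless_money_equity(win, wg, wbg, lg, lbg):
    """Cubeless money equity from the five output probabilities.

    Assumption: W(g)/L(g) include backgammons, so each gammon adds one
    extra point and each backgammon one more on top of that.
    """
    return (2.0 * win - 1.0) + (wg - lg) + (wbg - lbg)


def trials_for_plies(n):
    """Number of distinct ordered dice sequences over n rolls (36 per roll)."""
    return 36 ** n


# Gaps between the 3-ply evaluation and the 3-ply truncated rollout.
cubeless_gap = abs(0.015466 - 0.015449)
cubeful_gap = abs(0.020540 - 0.020794)

for label, (win, wg, wbg, lg, lbg, reported) in RESULTS.items():
    computed = cubeless_money_equity(win, wg, wbg, lg, lbg)
    print(f"{label:14s} computed {computed:+.6f}  reported {reported:+.6f}")

print("trials for 1, 2, 3 plies:", [trials_for_plies(n) for n in (1, 2, 3)])
print(f"3-ply gap: cubeless {cubeless_gap:.6f}, cubeful {cubeful_gap:.6f}")
```

On these figures the cubeless gap at 3-ply is much smaller than the
cubeful one, which is why a purely cubeless comparison looks worth
running.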
_______________________________________________
Bug-gnubg mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnubg
