My best player (TD-trained, separate race & contact networks, a couple of extra inputs beyond the standard Tesauro ones) has an average error of 0.0164 ppg/move on the contact set, so not surprisingly it is worse than GNUbg (I assume 1125 means 0.01125 ppg/move?).
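For reference, here is a minimal sketch of how I read the scoring procedure Øystein describes in the quoted message below. It is not the actual benchmark code: the data layout (a dict from candidate move to its rollout error, with the best move at 0.0) and the function names are assumptions made for illustration, and dividing the total by the number of positions is my reading of "average error".

```python
from typing import Dict, List, Tuple

# Hypothetical layout: one benchmark position maps each candidate move to its
# rollout error relative to the best candidate (the best move has error 0.0).
Position = Dict[str, float]


def position_error(candidates: Position, chosen_move: str) -> float:
    """Error charged to the player for a single benchmark position."""
    if chosen_move in candidates:
        # Zero if the player found the best move, otherwise the listed error.
        return candidates[chosen_move]
    # Move is not among the candidates at all: charge the worst listed error.
    return max(candidates.values())


def average_error(results: List[Tuple[Position, str]]) -> float:
    """Mean error per position over (candidates, chosen_move) pairs."""
    if not results:
        raise ValueError("no positions to score")
    total = sum(position_error(cands, move) for cands, move in results)
    return total / len(results)
```

Here `results` would pair each position's candidate list with the move the player's evaluator picks for it.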
I was also curious which benchmark set is most relevant for predicting match score, since of course a real game is a mixture of the position types. I took a bunch of my players, of varying skill, and calculated the average error rate on each of the three benchmark sets; I also played each against PubEval for 40k cubeless money games. Then I regressed the score in those games against the benchmark ERs to see which was most important (using R^2 as a proxy for importance). It turns out the contact benchmark is the most relevant, followed by crashed. Race is not that important.

Details here: http://compgammon.blogspot.com/2012/02/gnubg-benchmark-results.html

On Feb 12, 2012, at 8:51 AM, Øystein Schønning-Johansen wrote:

> I've looped through all 'm'-positions the following way:
>
> For each position I find the best move with my evaluator, and check whether my
> move is among the candidates in the list of moves. If it is not the
> best move, I add the error to the total. If my evaluator's move is not among
> the candidates at all, I assign the same error as the worst move among the
> candidates.
>
> For all positions in contact.bm, GNU Backgammon will have an error of about
> 1125 (IIRC).
>
> Please report how your players do.
>
> -Øystein
>
>
> 2012/2/12 Mark Higgins <[email protected]>
> Does anyone have the average error stats for 0-ply gnubg on the contact
> benchmarks?
>
> I see race & crashed results at Joseph's page here:
>
> http://homepages.ihug.co.nz/~peps/ngb/index-top.html
>
> but can't find the contact result anywhere. (Though I'd guess it's pretty
> close to the crashed error, i.e. around 0.01 ppg/move.)
>
>
> On Feb 10, 2012, at 6:26 AM, Øystein Schønning-Johansen wrote:
>
>> 'r' is the seed used for the rollout, I think
>>
>> Sounds likely, since there is an 'r'-line for every rollout result. But there
>> are no code lines in perr.py to confirm it. I guess we can trust your memory
>> on that.
>>
>> 'o' is the cube rollout, and the numbers are the rollout values of the
>> outcome probability,
>>
>> -Øystein
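A rough sketch of the regression comparison mentioned at the top of this message, in case anyone wants to repeat it with their own players. It assumes one separate single-variable least-squares fit per benchmark (score against that benchmark's error rate) and ranks the benchmarks by R^2; the function names and input layout are hypothetical, and the blog post has the actual details.

```python
from typing import Dict, List, Sequence, Tuple


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R^2 of the least-squares line y = a + b*x (the squared Pearson r)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a sample has zero variance")
    return (sxy * sxy) / (sxx * syy)


def rank_benchmarks(
    errors_by_benchmark: Dict[str, Sequence[float]],
    scores: Sequence[float],
) -> List[Tuple[str, float]]:
    """Return (benchmark, R^2) pairs, most explanatory benchmark first.

    errors_by_benchmark[name][i] is player i's average error on that
    benchmark; scores[i] is the same player's score against the reference
    opponent.
    """
    fits = {name: r_squared(errs, scores)
            for name, errs in errors_by_benchmark.items()}
    return sorted(fits.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_benchmarks` with one list of per-player error rates per benchmark (contact, crashed, race) and the matching list of scores gives the ordering by R^2. A joint multi-variable fit would be a different calculation.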
_______________________________________________
Bug-gnubg mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-gnubg
