On Tue, Jan 22, 2019 at 08:31:08AM -0800, Robert Edgar wrote:
> Can anyone confirm the score of a recent version of gnubg vs. pubeval? I
> hacked the source and found that gnubg v1.06 averaged +1.1ppg (82% wins)
> over 10k games, but a recent paper Papahristou & Refanidis (2017) quotes
> +0.60 ppg which is only marginally better than TD-Gammon (+0.59). My
> number seems high, but +0.6 seems too low considering how much effort
> went into optimizing the gnubg code.

Three 10k-game trials with the current net give (for 0-ply evaluations):

  +0.635ppg (71.1% wins)
  +0.630ppg (70.9% wins)
  +0.645ppg (71.7% wins)

Without counting backgammons the numbers become 0.612, 0.603 and 0.620.

+1.1ppg and 82% wins is simply impossible. There must be some bug in
your pubeval implementation or usage.

Amusingly, the message quoted in Ian Shaw's answer is from a thread
started by someone who got a similarly high number (from his own program
rather than gnubg), and it was due to such a bug:
https://lists.gnu.org/archive/html/bug-gnubg/2012-01/msg00019.html

FWIW, I ran shorter trials at 1 ply and 2 ply:

  1000 games @ 1 ply : +0.66ppg
   100 games @ 2 ply : +0.70ppg

If someone is interested, I could do these with 10 times more games (it
would take a few hours instead of a few minutes), but there would still
be a lot of uncertainty in the 2-ply result.
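
To put a rough number on that uncertainty, here is a minimal Python sketch of the usual standard-error arithmetic. It is not part of gnubg or pubeval; the function names are made up for illustration, and the per-game standard deviation of 1.3 points is an assumed placeholder, not a measured value. The only hard fact used is that every cubeless game scores at least 1 point in magnitude, which gives a lower bound on the per-game standard deviation.

```python
import math


def win_rate_standard_error(win_rate, games):
    """Binomial standard error of an observed win rate over `games` games."""
    return math.sqrt(win_rate * (1.0 - win_rate) / games)


def ppg_standard_error(per_game_sd, games):
    """Standard error of a mean points-per-game figure over `games` games."""
    return per_game_sd / math.sqrt(games)


def min_per_game_sd(mean_ppg):
    """Lower bound on the per-game standard deviation.

    Every game is worth at least 1 point in magnitude, so E[X^2] >= 1
    and the variance is at least 1 - mean^2.
    """
    return math.sqrt(max(0.0, 1.0 - mean_ppg ** 2))


# ASSUMPTION: illustrative per-game standard deviation, not measured.
ASSUMED_SD = 1.3

if __name__ == "__main__":
    for games in (100, 1000, 10000):
        half_width = 1.96 * ppg_standard_error(ASSUMED_SD, games)
        print(f"{games:>6} games: ppg +/- {half_width:.3f} "
              f"(95%, assumed sd = {ASSUMED_SD})")

    # Win-rate uncertainty for one of the 10k-game trials above.
    se = win_rate_standard_error(0.711, 10000)
    print(f"71.1% wins over 10000 games: +/- {100 * 1.96 * se:.2f} points (95%)")
```

Under that assumed standard deviation, the 95% half-width shrinks only with the square root of the number of games, so going from 100 to 1000 games at 2 ply would narrow the interval by a factor of about 3.2, not 10.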
