Thanks for your effort, Philippe. Your numbers look correct.

However, I think it is important to state some more details.

First: Are the games played to completion? Or are the games terminated at
race or bearoff or ...
Second: Does pubeval evaluate all the position classes? I once made the
mistake in a similar experiment where the pubeval player actually used a
full bearoff lookup table.
And then: These are cubeless money games, I assume, and not one-point
matches.
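
To make the last point concrete, here is a minimal sketch of how the same
set of game results gives different averages under three scoring
conventions: cubeless money play (1, 2 or 3 points per game), backgammons
counted as gammons, and one-point matches (every win counts 1). The function
name and the sample results are made up for illustration; they are not data
from any of the trials discussed here.

```python
def average_ppg(results, max_points=3):
    """Average points per game from one side's point of view.

    results: signed cubeless outcomes, +1/+2/+3 for a single/gammon/backgammon
             win and -1/-2/-3 for the corresponding losses.
    max_points: 3 = cubeless money game, 2 = backgammons counted as gammons,
                1 = one-point match (only win or loss matters).
    """
    capped = [max(-max_points, min(max_points, r)) for r in results]
    return sum(capped) / len(capped)


# Hypothetical results of ten games (illustration only, not measured data).
sample = [1, 1, 2, -1, 3, 1, -2, 1, 2, -1]

money = average_ppg(sample, max_points=3)
no_backgammons = average_ppg(sample, max_points=2)
one_point = average_ppg(sample, max_points=1)
```

With one-point scoring the average is simply wins minus losses per game, so
a 71.1% win rate alone would correspond to 2 * 0.711 - 1 = +0.422 ppg. That
is why the scoring convention has to be stated alongside the number.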

(Another potential bug is the opening roll. I guess that it is taken care
of.)

Thanks again for your effort, Philippe.
-Øystein


On Fri, Jan 25, 2019 at 9:51 PM Philippe Michel <[email protected]>
wrote:

> On Tue, Jan 22, 2019 at 08:31:08AM -0800, Robert Edgar wrote:
>
> > Can anyone confirm the score of a recent version of gnubg vs. pubeval? I
> > hacked the source and found that gnubg v1.06 averaged +1.1ppg (82% wins)
> > over 10k games, but a recent paper Papahristou & Refanidis (2017) quotes
> > +0.60 ppg which is only marginally better than TD-Gammon (+0.59). My
> > number seems high, but +0.6 seems too low considering how much effort
> > went into optimizing the gnubg code.
>
> Three 10k-game trials with the current net give (for 0 ply evaluations) :
> +0.635ppg (71.1% wins)
> +0.630ppg (70.9% wins)
> +0.645ppg (71.7% wins)
>
> Without counting backgammons the numbers become 0.612, 0.603 and 0.620.
>
> +1.1ppg and 82% wins is simply impossible. There must be some bug in
> your pubeval implementation or usage.
>
> Amusingly, the message quoted in Ian Shaw's answer is from a thread
> started by someone who got a similarly high number (from his own program
> rather than gnubg) and it was due to such a bug :
> https://lists.gnu.org/archive/html/bug-gnubg/2012-01/msg00019.html
>
>
> FWIW, I ran shorter trials at 1 ply and 2 ply.
> 1000 games @ 1 ply : +0.66ppg
> 100 games @ 2 ply : +0.70ppg
>
> If someone is interested, I could do these with 10 times more games (it
> would take a few hours instead of a few minutes) but there would still
> be a lot of uncertainty in the 2 ply result.
>
>
> _______________________________________________
> Bug-gnubg mailing list
> [email protected]
> https://lists.gnu.org/mailman/listinfo/bug-gnubg
>
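
On the uncertainty mentioned in the quoted message: a rough way to size it
is the binomial standard error of the win percentage. The sketch below uses
the 71.1% figure and the game counts quoted above; applying the same win
fraction to the 100-game 2-ply trial is an assumption, since no win
percentage was reported for it. The function name is made up for this
example.

```python
import math


def win_rate_std_error(p, n):
    """Binomial standard error of an observed win fraction p over n games."""
    return math.sqrt(p * (1.0 - p) / n)


# Win fraction and game count from the first 0-ply trial quoted above.
se_10k = win_rate_std_error(0.711, 10_000)

# Assumption: a similar win fraction at 2 ply, where only 100 games were played.
se_100 = win_rate_std_error(0.711, 100)

# Distance between the reported 82% and 71.1%, in standard errors of the latter.
gap_in_se = (0.82 - 0.711) / se_10k
```

This gives a standard error of roughly 0.45 percentage points for 10k games
and roughly 4.5 points for 100 games, so an 82% win rate is more than 20
standard errors away from 71.1% and cannot be sampling noise. It covers the
win percentage only; the error on ppg also depends on the spread of game
values, which is not reported in this thread.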