Re: Help with a new MET

Joseph Heled Sun, 10 Nov 2019 10:08:41 -0800

Hi Ian,

You obviously put a lot of thought and effort into this project. Here are
some initial reactions (more might come later).


1. RE rollouts, you probably want to use the Python interface.
Unfortunately I have been out of the loop for a long time, and can't
remember the details.

2. I suggest you contact a statistician. I suspect the number of trials you
need is quite large. The most important thing will be to establish a
confidence interval for your equities.

3. I generated the early MET files for GNUBG, and dabbled a little in
testing the effect of different tables. At the time I could not see a real
difference in chequer play. So, while METs are theoretically
interesting, personally I find it hard to believe a new one will make GNUBG
stronger in practice. I would be delighted to be proven wrong.

4. So, if you are (say) confident with the results up to 5 (or 7), I would
start by establishing the difference between the Variable MET and mec26
and/or XG for 5 or 7 point matches. This might be a good way to explore the
issues.

5. I would like to see a short description of the Variable MET method, if
you are willing to share it at this stage.
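
To put rough numbers on points 2 and 4, here is a minimal sketch using only the Python standard library. It treats each match as an independent coin flip, which is an assumption, and the per-match edges (51%, 52%, 55%) are hypothetical values chosen for illustration, not measurements. It gives a Wilson confidence interval for a 257-243 result and an approximate number of matches needed to detect a given edge over 50%.

```python
import math
from statistics import NormalDist


def wilson_interval(wins, n, confidence=0.95):
    """Wilson score interval for a binomial win rate."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = wins / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def matches_needed(p_true, alpha=0.05, power=0.80):
    """Approximate matches needed to tell p_true apart from 0.5.

    Normal approximation to a two-sided one-sample test of a proportion.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    numerator = z_a * 0.5 + z_b * math.sqrt(p_true * (1 - p_true))
    return math.ceil((numerator / (p_true - 0.5)) ** 2)


low, high = wilson_interval(257, 500)
print(f"257-243 over 500 matches: 95% interval {low:.3f} to {high:.3f}")

for edge in (0.51, 0.52, 0.55):  # hypothetical per-match win rates
    print(f"true win rate {edge:.2f}: about {matches_needed(edge)} matches")
```

If the interval for a 500-match series straddles 50%, that series alone cannot separate the two METs, which is why the total number of matches matters more than the result of any single night's run.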

Cheers, Joseph




On Mon, 11 Nov 2019 at 01:20, Ian Dunstan <[email protected]> wrote:

> Hi team,
>
> I am just finishing a project that has taken me many months (years), the
> creation of a new backgammon MET. My nearly finished MET is called 'PR2'
> and it is a combination of both rollouts and a theoretical MET. It uses
> rollout trials (+500k) of all match scores less than 9a9a and then a
> specially developed theoretical MET I call the 'Variable MET' to extend the
> rollout results to 31a31a. The Variable MET could generate all of the 9a9a
> MET probabilities in its own right, however, I use it in my MET as an
> extrapolation tool. A lot of time and effort has gone into the accuracy of
> both the 9a9a rollouts and the development of the Variable MET, more so the
> latter.
>
> I could go into a lot more detail if you wish, however, what I would like
> to do now is test my new PR2 MET and your help with 1) below is what I care
> most about. Tony Lezard (of Dueller renown) suggested I contact the Gnubg
> team after I asked him for help on testing.
>
> 1) What I would like to do is test my PR2 MET by doing a series of 5pt
> matches where one Gnubg player uses the PR2 MET and the other the now
> standard Kazaross-XG2 MET (in particular). The two players would face
> each other in 500 by 5pt matches at a time, with the results recorded. I
> don't care about the moves; who won that 500-match series is all that is
> important.
> I know that Gnubg can play itself now, however, not with different METs
> loaded and not without a lot of human input (every game end requires a
> manual prompt from the user for the next game to begin). That way of doing
> things is unworkable for me. What I need is a set-and-forget solution,
> something I can start running overnight so that in the morning the match
> wins are reported as something like (say) 257-243.
>
> I am only guessing how long 500 by 5pt matches would take me, even if
> fully automated. Additionally, I do not know how many sets of 500 by 5pt
> matches I would need to do to see a significant difference between METs. Maybe
> 5000, 50000 or 500000. After seeing the difference in equity the PR2 MET
> can sometimes produce I am hoping for the former.
>
> I have a friend who is a lot more computer savvy than I am and he has
> started playing around with different sockets/ports and instances of Gnubg.
> He tells me "You actually need 3 instances of gnubg running - I run all
> three without the graphical interface, only pure terminal versions".
> However, before he goes to too much trouble I thought it best to contact
> the Gnubg team and see if you can help.
>
> Maybe you only have to change "a couple of lines of programming" as
> someone on my forum suggested (lol). It won't be that easy, I know!
>
> 2) Jim Segrave thought this issue might be of interest to the team.
>
> https://i.postimg.cc/brQf7sVw/4a1a-C-seed-6987657-1036800.png
>
> There should not be any difference in the cubeless and cubeful results,
> however, there is. I think the cubeless results are right and the cubeful
> result discrepancy is due to some cubeful calculation drift. This
> particular rollout shows the discrepancy near the 5th dp. In other rollouts
> I did I believe the discrepancy crept into the 4th dp.
>
> My PR2 MET tries for accuracy to the second dp(%) in all of the 9a9a
> entries I rolled. E.g. I have 1a2aC as 68.36% after compiling over a
> million trials and that should be accurate to 2dp(%).
>
> Here is a further example. When I first rolled out 8a1a over 1 million
> times in a single rollout I got a final cubeful result of ~0.10705.
> However, I happened to be around my computer to watch the result at ~93%
> completion and see the equity climb steadily from 0.10688 for over an hour
> to reach 0.10705. So what, you may ask? Well, I have watched enough
> rollouts to suppose that the 3rd decimal place if not the fourth should be
> set in stone at nearly a million trials. Additionally, rollouts will have
> the equity jumping up and down a bit due to variance, this rollout was not
> doing that, equity just went up and up in this case.
>
> I was very suspicious so I then checked my 8a1a result by choosing 5 new
> seeds and doing 5x12960 trial rollouts using the same Gnubg settings. I got:
>
> 0.1067726
>
> 0.1068255
>
> 0.1066774
>
> 0.1066116
>
> 0.1067552
>
> The mean of these means is ~0.10672.
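
As a quick check on the arithmetic above, the following sketch recomputes the mean of the five seeded rollouts, estimates its standard error from their spread, and expresses the gap to the ~0.10705 long-rollout figure in standard errors. It assumes the five results are independent and ignores the long rollout's own uncertainty, so the ratio is indicative only.

```python
import math
from statistics import mean, stdev

# The five 12960-trial cubeful results quoted above.
short_rollouts = [0.1067726, 0.1068255, 0.1066774, 0.1066116, 0.1067552]

# Approximate final value of the single million+ trial rollout.
long_rollout = 0.10705

mean_short = mean(short_rollouts)

# Standard error of the mean, from the sample standard deviation.
std_err = stdev(short_rollouts) / math.sqrt(len(short_rollouts))

# Gap between the two estimates, in units of that standard error.
gap_in_std_errs = (long_rollout - mean_short) / std_err

print(f"mean of five rollouts : {mean_short:.7f}")
print(f"standard error        : {std_err:.7f}")
print(f"gap to long rollout   : {gap_in_std_errs:.1f} standard errors")
```

With only five samples the standard error is itself a rough estimate, so a t-distribution would be the more careful reference than a normal one.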
>
> In terms of a MET entry that would be 10.67% vs 10.71% for the million+
> rollout. 5x12960=64800 trials is not really a lot, however, I have done
> enough rollouts to know something is probably wrong here. I repeated this
> exercise with another million+ trial rollout vs 5x12960 trials. In this
> second case, the 5x12960 results were all close to the mean 89.70% while
> the million+ rollout was 89.45%. Again, very different and the million+
> trials are inaccurate in my opinion.
>
> I am guessing that there is some problem with the cubeful algorithm that
> first creeps in at the 7th significant figure (sf), then migrates to the
> 6th, 5th, 4th sf etc... all governed by the number of trials. For an
> average user, they won't ever see a problem at 5184 trials or even 51840
> trials. However, I saw a problem with 518400 trials and above. At the time
> of first seeing this issue, I abandoned the 25 x 1M+ trials I had done for
> my MET project and started again. The way around this problem for me was to
> do sets of 46656 trials and tabulate them carefully.
>
> An esoteric problem for sure and one that might be nearly irrelevant to
> everyone except me. However, there might be an easy remedy that has to do
> with increasing the number of sf used in Gnubg's cubeful algorithm(s).
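
The sketch below illustrates the general mechanism being guessed at here: a running total kept in single precision loses accuracy as it grows. Whether Gnubg's rollout code actually accumulates results this way is an assumption that has not been checked against its source. The helper names are hypothetical, and adding the same value every trial is a deliberately exaggerated case; real trial results vary, so any drift would be smaller and less systematic.

```python
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def running_mean(value, trials, single_precision):
    """Add the same value `trials` times and return the resulting mean.

    With single_precision=True the running total is rounded to float32
    after every addition, mimicking a single-precision accumulator.
    """
    total = 0.0
    for _ in range(trials):
        total += value
        if single_precision:
            total = to_float32(total)
    return total / trials


TRIALS = 1_000_000
VALUE = 0.107  # illustrative per-trial result, close to the 8a1a figure

mean64 = running_mean(VALUE, TRIALS, single_precision=False)
mean32 = running_mean(VALUE, TRIALS, single_precision=True)

print(f"double-precision accumulator: {mean64:.7f}")
print(f"single-precision accumulator: {mean32:.7f}")
```

Once the total is large, each small addition is rounded to the coarse spacing of representable single-precision values, so the error can accumulate in one direction instead of averaging out. Splitting a long run into smaller batches and averaging the batch means, as described above, keeps each total small and avoids this.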
>
> 3) Lastly, this is a small display problem to consider.
>
> While building a 31a31a MET I would check its extremities quite
> regularly to see if I had the right PR2 MET version loaded, and I noticed a
> problem. There is a display problem at 23a31a where the equity for 25a31a
> is shown instead. Incidentally, the 31a23a equity is correct in the Gnubg
> table. You will not see a problem in the display of most of the METs you
> have loaded (probably all the default ones you have) since a calculation
> will internally extrapolate results from ~15a15a (mec.c perhaps). My PR2
> MET is different, the extrapolation calculations Gnubg does for other METs
> do not start until after 31a31a. I think you have a small address problem
> to fix.
>
> Kind regards,
>
> Ian Dunstan
> (Australian Backgammon Federation Director)
>
>
