Re: Skill level names

Joseph Heled Mon, 08 Jul 2024 15:49:33 -0700

I am in favour of computing and showing all reasonable error measures
(where "all" is subject to a preference setting, i.e. the user can choose
which ones she wants to see or hide).


-Joseph

On Tue, 9 Jul 2024 at 09:48, Lasse Hjorth Madsen <
[email protected]> wrote:

> Thanks for pointing this out, Tim. I also think it is more appropriate to
> divide the sum of errors by the total number of moves, rather than the
> number of unforced moves.
>
> From a statistical point of view, whenever you subset, you risk
> introducing bias. Here, we subset all moves to unforced moves only. In this
> case we may create a bias that favors stronger players, as they probably
> know better than the average player how to avoid getting a lot of forced
> non-moves while on the bar.
>
> I generally think it's better to correct past errors than to replicate
> them, so I think gnubg should do just that, and divide by all moves.
>
> I don't expect many players to agree, though.
>
> /Lasse
>
> On Mon, 8 Jul 2024 at 21:46, Timothy Y. Chow <
> [email protected]> wrote:
>
>> Ian Shaw wrote:
>>
>> > The scaling of the PR values comes historically from Snowie, which used
>> > the sum of both players' moves as the divisor. Gnubg uses only the
>> > player's unforced moves, which naturally means gnubg error rates are at
>> > least double Snowie error rates. When XG was created Xavier calculated
>> > the error rates using the same method as gnubg, but then divided by 2 to
>> > scale them to match the Snowie Error Rate, which is what most people
>> > were familiar with.
>>
>> XG's definition of PR is rather complicated because of its definition of a
>> "decision":
>>
>> http://timothychow.net/cg/www.bgonline.org/forums/197598.html
>>
>> Since PR has become a de facto standard, it makes sense to try to
>> replicate it. But replicating PR would require some additional programming
>> since it's not quite the same as GNU's native error rate calculation.
>>
>> I'm not in favor of dropping Snowie ER entirely. It has its merits, or
>> rather, PR has its pathologies. Neil Robins pointed out one surprising
>> example here:
>>
>> https://www.bgonline.org/forums/webbbs_config.pl?read=154585
>>
>> More generally, as I've stated numerous times on rec.games.backgammon and
>> BGOnline, eliminating forced or obvious moves from the denominator has
>> some strange consequences that most people don't seem to appreciate. One
>> reason we divide the total equity lost by the length of the session is so
>> that errors are weighted according to their *frequency of occurrence in
>> actual play*. If a very unusual type of decision arises and I botch it,
>> then that should not count against me as much as a very common type of
>> decision that I mess up (assuming both types of mistake cost 0.05 each,
>> say). So far so good.
>>
>> But now think about what happens if we delete forced moves from the
>> denominator. That means that errors occurring in games with a lot of
>> forced moves hurt our PR more than errors occurring in games with no
>> forced moves. In two separate games, I might make an error of exactly the
>> same size, but in one game I get unlucky and get closed out. My PR will
>> probably suffer more in the game where I have bad luck, because I'll be
>> dividing my equity loss by a smaller number. Is this what we really want
>> from PR? Maybe, maybe not. It's not obvious to me. A large majority of the
>> backgammon community has somehow gone along with this way of doing things
>> without thinking it through, or even recognizing that there is something
>> to think about here.
>>
>> Somehow people have come to conceptualize a backgammon session as a
>> sequence of quiz problems, where the only role of the denominator is to
>> measure the length of the quiz, but in reality there can be correlations
>> (or anti-correlations) between the *types* of decisions you're presented
>> with in a game and the *number* of decisions in the game. By messing with
>> the denominator in a funny way, PR produces some strange and
>> hard-to-understand effects. ER has the advantage of keeping things simple:
>> the denominator is just the number of rolls. That is the most obvious
>> measure of length, and it has the advantage of being simple to understand.
>>
>> If GNU stops calculating Snowie ER, then it will be very difficult to
>> extract this potentially illuminating and instructive statistic from a
>> backgammon session.
>>
>> Tim
>>
>>
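
The denominator effect Tim describes can be illustrated with a small
sketch. This is not gnubg, Snowie, or XG code; the function name and the
game figures are hypothetical, chosen only to show how the same 0.05
error is rated when dividing by all moves versus unforced moves only.

```python
def error_rates(equity_lost, total_moves, forced_moves):
    """Return (per-total-move rate, per-unforced-move rate) for one game.

    Hypothetical helper for illustration; it does not reproduce any
    program's actual error-rate calculation.
    """
    unforced_moves = total_moves - forced_moves
    return equity_lost / total_moves, equity_lost / unforced_moves


# Two made-up games of 20 moves each, with the same single 0.05 error.
# In the second game the player is closed out and 8 moves are forced.
smooth_game = error_rates(0.05, total_moves=20, forced_moves=0)
closed_out_game = error_rates(0.05, total_moves=20, forced_moves=8)

print("no forced moves:", smooth_game)
print("closed out:     ", closed_out_game)
```

With these assumed numbers the per-total-move rate is identical in both
games, while the per-unforced-move rate is higher in the closed-out game,
because the same equity loss is divided by 12 moves instead of 20.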
