I am in favour of computing and showing all reasonable error measures (where "all" is subject to a preference setting, i.e. the user can choose which ones she wants to see or hide).
-Joseph

On Tue, 9 Jul 2024 at 09:48, Lasse Hjorth Madsen <[email protected]> wrote:

> Thanks for pointing this out, Tim. I also think it is more appropriate to
> divide the sum of errors by the total number of moves, rather than the
> number of unforced moves.
>
> From a statistical point of view, whenever you subset, you risk
> introducing bias. Here, we subset all moves to unforced moves only. In this
> case we may create a bias that favors stronger players, as they probably
> know better than the average player how to avoid getting a lot of forced
> non-moves while on the bar.
>
> I generally think it's better to correct past errors than to replicate
> them, so I think gnubg should do just that, and divide by all moves.
>
> I don't expect many players to agree, though.
>
> /Lasse
>
> On Mon, 8 Jul 2024 at 21:46, Timothy Y. Chow <[email protected]> wrote:
>
>> Ian Shaw wrote:
>>
>> > The scaling of the PR values comes historically from Snowie, which used
>> > the sum of both players' moves as the divisor. Gnubg uses only the
>> > player's unforced moves, which naturally means gnubg error rates are at
>> > least double Snowie error rates. When XG was created Xavier calculated
>> > the error rates using the same method as gnubg, but then divided by 2 to
>> > scale them to match the Snowie Error Rate, which is what most people
>> > were familiar with.
>>
>> XG's definition of PR is rather complicated because of its definition of a
>> "decision":
>>
>> http://timothychow.net/cg/www.bgonline.org/forums/197598.html
>>
>> Since PR has become a de facto standard, it makes sense to try to
>> replicate it. But replicating PR would require some additional programming
>> since it's not quite the same as GNU's native error rate calculation.
>>
>> I'm not in favor of dropping Snowie ER entirely. It has its merits, or
>> rather, PR has its pathologies. Neil Robins pointed out one surprising
>> example here:
>>
>> https://www.bgonline.org/forums/webbbs_config.pl?read=154585
>>
>> More generally, as I've stated numerous times on rec.games.backgammon and
>> BGOnline, eliminating forced or obvious moves from the denominator has
>> some strange consequences that most people don't seem to appreciate. One
>> reason we divide the total equity lost by the length of the session is so
>> that errors are weighted according to their *frequency of occurrence in
>> actual play*. If a very unusual type of decision arises and I botch it,
>> then that should not count against me as much as a very common type of
>> decision that I mess up (assuming both types of mistake cost 0.05 each,
>> say). So far so good.
>>
>> But now think about what happens if we delete forced moves from the
>> denominator. That means that errors occurring in games with a lot of
>> forced moves hurt our PR more than errors occurring in games with no
>> forced moves. In two separate games, I might make an error of exactly the
>> same size, but in one game I get unlucky and get closed out. My PR will
>> probably suffer more in the game where I have bad luck, because I'll be
>> dividing my equity loss by a smaller number. Is this what we really want
>> from PR? Maybe, maybe not. It's not obvious to me. A large majority of the
>> backgammon community has somehow gone along with this way of doing things
>> without thinking it through, or even recognizing that there is something
>> to think about here.
>>
>> Somehow people have come to conceptualize a backgammon session as a
>> sequence of quiz problems, where the only role of the denominator is to
>> measure the length of the quiz, but in reality there can be correlations
>> (or anti-correlations) between the *types* of decisions you're presented
>> with in a game and the *number* of decisions in the game. By messing with
>> the denominator in a funny way, PR produces some strange and
>> hard-to-understand effects. ER has the advantage of keeping things simple:
>> the denominator is just the number of rolls. That is the most obvious
>> measure of length, and it has the advantage of being simple to understand.
>>
>> If GNU stops calculating Snowie ER, then it will be very difficult to
>> extract this potentially illuminating and instructive statistic from a
>> backgammon session.
>>
>> Tim
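
To make the denominator effect in Tim's closed-out example concrete, here is a minimal sketch with made-up numbers. The `Game` class, both function names, and the move counts are hypothetical and are not taken from gnubg, Snowie, or XG; the sketch only contrasts "divide by all of the player's moves" with "divide by unforced moves only" and makes no attempt to reproduce any program's actual PR or ER formula.

```python
from dataclasses import dataclass


@dataclass
class Game:
    """Hypothetical per-player summary of one game (illustration only)."""
    total_moves: int    # every roll the player had, forced or not
    forced_moves: int   # rolls with no real choice, e.g. dancing on the bar
    equity_lost: float  # total equity given up through errors


def rate_per_move(game: Game) -> float:
    """Equity lost divided by all of the player's moves."""
    return game.equity_lost / game.total_moves


def rate_per_unforced_move(game: Game) -> float:
    """Equity lost divided by unforced moves only."""
    unforced = game.total_moves - game.forced_moves
    if unforced == 0:
        return 0.0  # no decisions at all, so no error rate to report
    return game.equity_lost / unforced


# Assumed numbers: the same single 0.05 error in two games of equal length.
# In the second game the player is closed out and has 10 forced moves.
open_game = Game(total_moves=30, forced_moves=0, equity_lost=0.05)
closed_out = Game(total_moves=30, forced_moves=10, equity_lost=0.05)

for name, game in [("open game", open_game), ("closed out", closed_out)]:
    print(f"{name}: per move {rate_per_move(game):.5f}, "
          f"per unforced move {rate_per_unforced_move(game):.5f}")
```

With these assumed inputs the per-move rate is identical in both games, while the per-unforced-move rate is higher in the game where the player was closed out, even though the error itself was the same size.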
