[EM] Improving the Sainte-Laguë index

Kristofer Munsterhjelm Wed, 11 Sep 2013 00:56:29 -0700

The Sainte-Laguë index is optimized by the Sainte-Laguë method. It is:


SUM over all parties p: (V_p - S_p)^2 / V_p

where V_p is the fraction of votes for a party, and S_p is the fraction of seats. However, the score can range to infinity, so it's not clear what it measures. Other indices measure disproportionality in percent and so can't go beyond 100%.
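A minimal sketch of the formula above in Python, to make the unbounded range concrete. The function name and the example vote/seat fractions are made up for illustration, and every party is assumed to have a nonzero vote share.

```python
def sainte_lague_index(vote_fracs, seat_fracs):
    """SUM over parties p of (V_p - S_p)^2 / V_p, computed on fractions."""
    return sum((v - s) ** 2 / v for v, s in zip(vote_fracs, seat_fracs))

# Mildly disproportional three-party result (hypothetical numbers).
mild = sainte_lague_index([0.5, 0.3, 0.2], [0.6, 0.3, 0.1])

# A party with 1% of the votes taking every seat: the index is far above 1,
# so it cannot be read as a percentage.
extreme = sainte_lague_index([0.99, 0.01], [0.0, 1.0])

print(mild, extreme)
```

The first value comes out at about 0.07 and the second at about 99, and the second grows without bound as the winning party's vote share shrinks.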

But the Sainte-Laguë index looks very similar to the chi-square value for goodness of fit:


x^2 = SUM over all entries x: (O_x - E_x)^2 / E_x

where O_x is observed and E_x is expected. Note that since (x-y)^2 = (y-x)^2, this is equivalent to considering the fraction of seats as observed (O_x) and the fraction of votes as expected (E_x). In other words, a perfectly PR assembly would give exactly the same fraction of seats to party P as the voters gave party P votes.

What does the Sainte-Laguë index measure? It gives a value on a chi-square distribution according to how likely the assembly is to have been drawn in an unbiased manner with respect to the vote fractions, were the drawing random.

But the statistic itself usually isn't of interest. So that suggests that one reverses the x^2, i.e. the Sainte-Laguë index, to get a p-value. And that p-value *can* be interpreted, and does measure something useful.

At least for large assemblies, drawing an assembly at random would often give representative results, and sometimes unrepresentative ones. When the assembly is unrepresentative, it is unlike what you would expect to see when the assembly is drawn at random. Thus, if the assembly is typical of something you would see at random, it is representative. The value of a PR method, according to that interpretation, would lie in always getting a representative assembly instead of getting one that usually, but not always, is representative.

So in order to understand what the Sainte-Laguë index says, it appears we should consider it as the result of a chi-square test and infer a p-value from it.
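A rough sketch of that conversion, using only the Python standard library. Two assumptions here are not spelled out above: the chi-square test is defined on counts, so the statistic is computed from seat counts (which equals the fraction-based index multiplied by the number of seats), and the degrees of freedom are taken as the number of parties minus one, as is usual for a goodness-of-fit test. The names chi2_sf and sli_p_value and the example numbers are made up for illustration.

```python
import math

def chi2_sf(x, df):
    """Upper tail P(X >= x) of a chi-square distribution, integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 2 (a = 1) or df = 1 (a = 1/2).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

def sli_p_value(votes, seats):
    """votes: vote counts (or fractions) per party; seats: seat counts.
    Assumes every party has a nonzero vote share."""
    total_votes = sum(votes)
    n_seats = sum(seats)
    stat = 0.0
    for v, s in zip(votes, seats):
        expected = n_seats * v / total_votes  # seats under perfect PR
        stat += (s - expected) ** 2 / expected
    return stat, chi2_sf(stat, len(votes) - 1)

# Same seat fractions, different assembly sizes (hypothetical numbers).
small_stat, small_p = sli_p_value([50, 30, 20], [6, 3, 1])
large_stat, large_p = sli_p_value([50, 30, 20], [60, 30, 10])
print(small_stat, small_p)
print(large_stat, large_p)
```

With 10 seats the p-value is roughly 0.70, so the result looks like an ordinary random draw; with the same fractions in a 100-seat assembly it drops to roughly 0.03. The fraction-based index alone cannot tell these two cases apart.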

The x^2 has limitations. It may err when there are few seats, or when there are very many parties with little support each. But since we know what we're looking for (a p-value of goodness of fit), we can instead use something that provides it even in those cases: the G-test when the expected (fraction of votes) numbers are low, and an exact multinomial (binomial in a two-party case) test when there are few seats.
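For the two-party, few-seats case, an exact binomial test can be sketched as below. This uses one common definition of the two-sided exact p-value (the total probability of every seat split that is no more likely than the observed one); other definitions exist, and the function name and numbers are made up for illustration.

```python
import math

def exact_binomial_p(seats_a, n_seats, vote_frac_a):
    """Two-sided exact p-value for party A winning seats_a of n_seats,
    if seats were drawn as Binomial(n_seats, vote_frac_a)."""
    def pmf(k):
        return (math.comb(n_seats, k)
                * vote_frac_a ** k
                * (1.0 - vote_frac_a) ** (n_seats - k))

    observed = pmf(seats_a)
    # Sum all outcomes at most as likely as the observed one; the small
    # relative tolerance keeps exact ties from being lost to rounding.
    return sum(pmf(k) for k in range(n_seats + 1)
               if pmf(k) <= observed * (1.0 + 1e-9))

# Party A has 60% of the votes in a 5-seat assembly (hypothetical numbers).
p_three = exact_binomial_p(3, 5, 0.6)  # the most likely split
p_five = exact_binomial_p(5, 5, 0.6)   # A sweeps every seat
print(p_three, p_five)
```

Three seats of five is the most probable outcome, so its p-value is 1; a clean sweep gives about 0.165, which with only five seats is still not strong evidence against a random draw.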

We can in any case use the G-test instead of the chi-square since (to my knowledge) the former is strictly closer to the multinomial test than is the latter. So an improved Sainte-Laguë index looks like


ISLI = 2 * SUM over all parties p: S_p * ln(S_p / V_p),

and will return the same thing the original Sainte-Laguë index does: a value along the chi-square distribution. These values can be turned into p-values by means of a chi-square distribution function with n-1 degrees of freedom, where n is the number of parties.
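A small sketch comparing the two statistics on the same result. As an assumption, both are computed on seat counts rather than fractions, which is how the G-test and chi-square test are normally stated and amounts to multiplying the fraction-based formulas by the number of seats. A party with zero seats contributes nothing to the G sum (the usual 0 * ln 0 = 0 convention). Function names and numbers are made up for illustration.

```python
import math

def expected_seats(votes, n_seats):
    """Seats each party would hold under perfect proportionality."""
    total_votes = sum(votes)
    return [n_seats * v / total_votes for v in votes]

def g_statistic(votes, seats):
    """G = 2 * SUM over parties of O * ln(O / E), on seat counts."""
    exp = expected_seats(votes, sum(seats))
    # Parties with zero seats are skipped: 0 * ln(0) is taken as 0.
    return 2.0 * sum(s * math.log(s / e) for s, e in zip(seats, exp) if s > 0)

def chi2_statistic(votes, seats):
    """Pearson chi-square = SUM over parties of (O - E)^2 / E."""
    exp = expected_seats(votes, sum(seats))
    return sum((s - e) ** 2 / e for s, e in zip(seats, exp))

votes = [50, 30, 20]   # hypothetical vote counts
seats = [6, 3, 1]      # hypothetical 10-seat assembly
g = g_statistic(votes, seats)
x2 = chi2_statistic(votes, seats)
print(g, x2)
```

Here G is about 0.80 against a chi-square value of 0.70; both are zero for a perfectly proportional assembly, and both would be referred to the same chi-square distribution with n-1 degrees of freedom.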

Finally: the index (and the improved index) measures accuracy or goodness of fit with respect to support by the voters. Since we use the same fraction for voter support no matter the number of seats, the most accurate method would be house monotone. I've already shown instances where house monotonicity is not desirable, so in some sense, one could say the index measures accuracy of the wrong thing, at least when there are few seats.

A way of getting around this is to ask the method to optimize accuracy of something that is closer to what we want. But "what we want" may not be directly accessible. It's a relative quantity: "voter X is better represented by Y than by Z". And so, finding out how to do that as well as possible is still open to research. The Sainte-Laguë index (and the modified SLI) might give a good asymptotic result though (as the number of seats approaches the number of voters).

(In my disproportionality measurement program for individual candidate multiwinner elections, I sidestepped this problem by giving each voter, and candidate, hidden yes/no opinions. The voters would rank the candidates so those closer in opinion to themselves came first, and then the disproportionality was determined based on the distribution of opinions, not candidates, in the assembly and among the people.)

----
Election-Methods mailing list - see http://electorama.com/em for list info
