I've sent this message to a few individuals. My apologies if you receive it individually and on the list.
Let me start with a conclusion, and then justify it later.

Webster's and Hill's minimizations of the distance of states' s/q from 1, or from each other, are incompatible with unbias. If you choose those Hill or Webster optimizations for single states or pairs of states, then you're also choosing bias.

Maybe some believe that those 1-state and 2-state s/q distance minimizations are more important than unbias. Hello? Would anyone want to try to justify intentionally and systematically giving small states more s/q than large states (or vice versa)?

Maybe a good starting definition of bias is: "That which, in PR, would give the smallest states incentive to coalesce, or give the largest states incentive to split, in order to maximize their s/q".

When I found out that, with a flat state-size frequency distribution, Webster is slightly large-biased, I posted about it to EM. That's when I first proposed Bias-Free. I don't know if all of you are on EM, and so I feel that I should repeat that discussion in this letter.

Say we graph Webster's s(q) step-function, with the horizontal scale, the q scale, labeled in quotas, and the vertical scale labeled in seats. The 1-seat-per-quota line, the s=q line, of course rises at 45 degrees from the origin. At first glance, Webster looks unbiased, because its step-function is perfectly symmetrical about the s=q line. You can't get any closer to 1 seat per quota than that, right?

But there's a problem. Let me define a few terms:

A cycle is the range between two consecutive integer values of q. For example, between 4 quotas and 5 quotas.

A cycle's lower section is the part below the rounding point. Its upper section is the part above the rounding point.

Consider two corresponding points in a cycle's upper and lower sections: two points equidistant from the rounding point (which in Webster is in the middle of the cycle). Their seats differ equally and oppositely from what 1 seat per quota would give them.
But, for the state in the lower section, that represents a greater loss of s/q, because q is lower. So the overall s/q in the cycle is less than 1.

That problem is more pronounced for low-population cycles, because q differs by a greater factor between the lower and upper sections. If the state-size frequency distribution is flat, then the lower cycles have less overall s/q.

One way to solve that: (Computer keyboards don't have "delta", and so I'm going to use "D" to stand for finite differences.) Sum Ds/q over a cycle, where Ds is the difference between a state's seats and its q. Set it equal to zero, and solve for R, the rounding point. That gives Bias-Free.

Bias-Free's rounding point, between the consecutive integers a and b, is ((b^b)/(a^a))(1/e).

Bias-Free ensures that a cycle's overall s/q is 1 (or as close to it as possible). Cycle-Webster accomplishes the same thing by applying Webster to cycles instead of individual states.

Looking at Hill's s(q) step-function graph, Hill departs blatantly asymmetrically from the s=q line, tending to be above it, more so for the lower-population cycles.

Webster's and Hill's rounding points differ from those of Bias-Free. (Hill's differ by about twice as much as Webster's. Between 1 and 2, for example, Bias-Free rounds at 4/e, about 1.472, Webster at 1.5, and Hill at the square root of 2, about 1.414.)

Both the graphs and the differences in the rounding points show that Webster and Hill are biased. Their s/q distance minimizations are incompatible with unbias, as I said earlier.

When we speak of bias, isn't it understood that we're speaking of a tendency that is consistent in its direction (favoring larger or smaller states)? That we're speaking of something that has its effect even over great population differences?

The trouble is that, if we measure bias as the correlation between states' q and s/q, we aren't just looking at that long-range consistent trend. We're also including a different kind of bias, within the cycles, a bias that reverses itself within each cycle. I'll call that "micro-bias".

In any cycle, states above the rounding point have more s/q than states below the rounding point.
But, just looking at states below the rounding point (or above it), s/q decreases with increasing q.

So I suggest that that intra-cycle micro-bias is not what we mean by bias. It isn't even _part_ of what we mean by bias, because bias is a trend that is consistent in its direction over long ranges of population.

Earlier I said that a good starting definition of bias is "That which, in PR, would give the smallest states incentive to coalesce, or give the largest states incentive to split, in order to maximize their s/q".

Micro-bias doesn't do that. Bias that's consistent in its direction over long ranges of population does that. Jefferson is strongly large-biased, but it's small-biased within each cycle.

If you agree with that starting definition, then you agree that bias should be measured on the large scale, ignoring intra-cycle micro-bias. That cycles should be the smallest units looked at to measure bias. And that Cycle-Webster is the unbiased method. And Bias-Free, when the distribution is flat.

I agree with Warren that, if we measure bias as _states'_ correlation of q and s/q, then, with micro-bias in the mix, yes, it would be very difficult to say something theoretical about bias, and only empirical measurement can say anything.

But if we leave out micro-bias, looking only at large-scale bias, on a scale no finer than cycles, then it becomes simpler, and theory can say a few things. A few things still remain to be found out by empirical testing, such as whether our census' state-size frequency distribution, tending to cause some large-bias, can save Hill from its small-bias. Well, every empirical result I've heard of says "No". Even with the distribution's large-bias, Hill is still much more biased than Webster.

I suggest that bias-testing should mean looking at the correlation of _cycles'_ average q and their average s/q.
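As a rough numerical check of the cycle-level test suggested above, here is a minimal Python sketch. It rests on several assumptions that are not in the letter: a flat distribution is modelled as evenly spaced q values within each cycle, a cycle's "average s/q" is taken as the plain mean of its states' s/q, cycles 1 through 20 are used (cycle 0 is left out because s/q is unbounded there when every state gets a seat), and the function names are made up for illustration. Hill's rounding point is the geometric mean of a and b.

```python
import math

def webster_point(a):
    """Webster rounds at the arithmetic mean of a and a+1."""
    return a + 0.5

def hill_point(a):
    """Hill rounds at the geometric mean of a and a+1."""
    return math.sqrt(a * (a + 1))

def bias_free_point(a):
    """Bias-Free rounding point ((b^b)/(a^a))(1/e), with b = a + 1."""
    b = a + 1
    return (b ** b) / (a ** a) / math.e

def cycle_stats(point, a, n=10000):
    """Mean q and mean s/q over n evenly spaced states in the cycle [a, a+1).

    Assumption: evenly spaced q stands in for a flat size distribution.
    """
    r = point(a)
    qs = [a + (i + 0.5) / n for i in range(n)]
    ratios = [(a if q < r else a + 1) / q for q in qs]
    return sum(qs) / n, sum(ratios) / n

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def cycle_bias(point, cycles=range(1, 21)):
    """Correlate the cycles' average q with their average s/q."""
    stats = [cycle_stats(point, a) for a in cycles]
    mean_q = [s[0] for s in stats]
    mean_sq = [s[1] for s in stats]
    return pearson(mean_q, mean_sq), mean_sq

if __name__ == "__main__":
    for name, point in [("Webster", webster_point),
                        ("Hill", hill_point),
                        ("Bias-Free", bias_free_point)]:
        corr, means = cycle_bias(point)
        print(name, "correlation:", corr, "first cycle mean s/q:", means[0])
```

Under these assumptions the Webster correlation should come out positive (large-bias) and the Hill correlation negative (small-bias), while the Bias-Free cycle averages stay very close to 1, so its correlation is essentially numerical noise. Real census data would need the actual state populations in place of the evenly spaced grid.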
I suggest that, as long as we aren't calculating the probability of the correlation, the more sensitive Pearson correlation should be used. When looking at the correlation with respect to individual states, maybe Spearman's rank correlation, by ignoring some detail, might ignore some micro-bias, and that would be a good thing, suggesting that Spearman is right for correlation measured with respect to individual states.

By the way, Cycle-Webster can have two versions.

In one version, which I'll call "Hare Cycle-Webster", the cycles defined according to the states' Hare quotas remain the cycles that Webster is being applied to, throughout the Webster process. So, since we're talking about Hare quotas, the cycles continue to be the same as initially, and they contain the same states they initially did, and have the same total quotas as they initially did. Of course, when Webster is applied to the cycles, the quota is varied to give the right house size. The same iterative process that is used for ordinary Webster (and Hill and Jefferson, etc.) can of course then be used when Hare Cycle-Webster applies Webster to the cycles.

The alternative would be to make the cycles, and their states and their total quota, be based on the current quota being used in the Webster process. Much more work to handcount or program. Almost surely not necessary. So when I speak of Cycle-Webster, I mean Hare Cycle-Webster.

No doubt the 2 versions could give different results. That doesn't mean that one is biased: Webster and Hamilton sometimes give different results, but they don't differ in their long-term bias. Hamilton is more random. Then, is one of the Cycle-Webster versions more random than the other? Maybe one steady and one random? Well, all methods have an unavoidable random component. The 2 Cycle-Webster versions could be equally random, and get different random results, to the extent that they're random. That's how I expect it is.
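The cycle-level step of Hare Cycle-Webster described above might be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the example populations are hypothetical, Webster is run in its highest-averages (Sainte-Lague) form, which gives the same allocation as searching for a quota that yields the right house size, ties go to whichever cycle comes first, and the letter does not say how a cycle's seats are then divided among its states, so that step is left out.

```python
from collections import defaultdict
from math import floor

def webster(weights, seats):
    """Webster by highest averages: each seat goes to the largest w/(2s+1).

    Ties are broken by dictionary order (an arbitrary choice for this sketch).
    """
    alloc = {k: 0 for k in weights}
    for _ in range(seats):
        best = max(weights, key=lambda k: weights[k] / (2 * alloc[k] + 1))
        alloc[best] += 1
    return alloc

def hare_cycle_webster(populations, house_size):
    """Group states into cycles by Hare quota, then apply Webster to cycles.

    Returns (cycles, cycle_seats). Dividing a cycle's seats among its
    member states is not specified in the letter and is not attempted.
    """
    hare_quota = sum(populations.values()) / house_size
    cycles = defaultdict(list)
    for state, pop in populations.items():
        # A state with q Hare quotas belongs to the cycle [floor(q), floor(q)+1).
        cycles[floor(pop / hare_quota)].append(state)
    # The cycles stay fixed from here on; only their totals go to Webster.
    cycle_pop = {a: sum(populations[s] for s in members)
                 for a, members in cycles.items()}
    return dict(cycles), webster(cycle_pop, house_size)

if __name__ == "__main__":
    # Hypothetical populations, chosen so the Hare quota is 1000.
    example = {"A": 2600, "B": 2300, "C": 1700,
               "D": 1400, "E": 1200, "F": 800}
    cycles, cycle_seats = hare_cycle_webster(example, 10)
    print(cycles)       # {2: ['A', 'B'], 1: ['C', 'D', 'E'], 0: ['F']}
    print(cycle_seats)  # {2: 5, 1: 4, 0: 1}
```

The other version, in which cycle membership is recomputed from the current quota, would need the grouping to move inside the quota search, which is where the extra work mentioned above comes from.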
Mike Ossipoff

----
election-methods mailing list - see http://electorama.com/em for list info
