On 07/15/2012 11:47 AM, Michael Ossipoff wrote:
If unbias in each allocation is all-important, then can anything else be
as good as trial-and-error minimization of the measured correlation
between q and s/q, for each allocation?


You answered this below. If you know the distribution, then you can directly
find out what kind of rounding rule would be best and you'd use that one.

Yes, but that's a big "if".  But if you, by trial and error, in each
allocation, minimize the measured correlation between q and s/q, then
you're achieving unbias, and you needn't know or assume anything about
that distribution.
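
A minimal sketch of that trial-and-error search, under assumptions that are mine and not from the thread: q is taken to be a state's quota (its population share times the house size), s is its seat count, and the rule being tuned is a divisor method with a single rounding point p that is the same in every interval. The populations and the function names are made up for illustration.

```python
import math

def allocate(populations, seats, p):
    """Divisor method whose rounding point in every interval [n, n+1] is n + p.

    p = 0.5 gives Webster. Seats are handed out one at a time to the state
    with the largest population / (seats_so_far + p).
    """
    alloc = [0] * len(populations)
    for _ in range(seats):
        i = max(range(len(populations)),
                key=lambda k: populations[k] / (alloc[k] + p))
        alloc[i] += 1
    return alloc

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def measured_bias(populations, seats, p):
    """Absolute correlation between q and s/q for one allocation."""
    total = sum(populations)
    quotas = [pop * seats / total for pop in populations]
    alloc = allocate(populations, seats, p)
    ratios = [s / q for s, q in zip(alloc, quotas)]
    return abs(pearson(quotas, ratios))

def best_rounding_point(populations, seats, candidates):
    """Trial and error: keep the candidate p with the smallest measured bias."""
    return min(candidates, key=lambda p: measured_bias(populations, seats, p))

# Made-up populations, for illustration only (not census figures).
POPULATIONS = [5_300_000, 2_100_000, 1_250_000, 870_000,
               640_000, 410_000, 230_000, 95_000]
CANDIDATES = [i / 100 for i in range(30, 71)]  # p from 0.30 to 0.70

p_best = best_rounding_point(POPULATIONS, 40, CANDIDATES)
print(p_best, allocate(POPULATIONS, 40, p_best))
```

Since 0.5 is among the candidates, the p this picks never has a larger measured correlation than Webster's on the same data, but it is tuned to that one allocation only.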

Besides, Pearson correlation is well-known, and WW is new. And
minimizing correlation is obvious and easily explained.  Of course
you're losing the minimization of the deviation of states' s/q from
its ideal equal value.

On the other hand, if one exponential function, over the entire
range of states, is a good approximation, then we have a constant p.
And a p that is just very slightly less than .5 wouldn't be so hard
to get acceptance for, if it's explained that it gets rid of Webster's
tiny bias of about 1/3 of one percent, to better attain true
unbias.

So either approach would be proposable, if that one overall
exponential is a good approximation. But Warren himself admitted that,
at the low-population end, it isn't accurate, because the states, at
some point, stop getting smaller. But Warren said that his single
exponential function worked pretty well in his tests.

Gibrat's law ( https://en.wikipedia.org/wiki/Gibrat%27s_law ) suggests a log-normal distribution, and indeed the tail of such a distribution is exponential.

After reading about it, I did some tests with US state populations (from the latest census), and a kernel density estimate of the logarithms of the populations shows a distribution that looks a lot like a sum of Gaussians. So log-normal is a reasonable first approximation. To get a better approximation, use the exponential of a fitted sum of Gaussians, which would be log-normals multiplied -- I think.
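
For anyone who wants to repeat that kind of check, here is a rough sketch of a Gaussian kernel density estimate on log-populations, compared against a single fitted Gaussian (i.e. a plain log-normal fit). It is not the script used for the test above. The data are synthetic draws from a log-normal, standing in for real census figures, and the bandwidth is the usual normal-reference rule of thumb; both are assumptions for illustration.

```python
import math
import random

def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth: 1.06 * sample std * n^(-1/5)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-1 / 5)

def gaussian_kde(samples, bandwidth):
    """Return a function x -> estimated density, using Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)
    return density

def fitted_normal(samples):
    """Single Gaussian fitted to the samples by mean and standard deviation."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    def density(x):
        z = (x - mean) / std
        return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))
    return density

# Synthetic stand-in for state populations (NOT census data).
rng = random.Random(0)
populations = [rng.lognormvariate(15.0, 1.0) for _ in range(50)]
logs = [math.log(pop) for pop in populations]

kde = gaussian_kde(logs, rule_of_thumb_bandwidth(logs))
normal = fitted_normal(logs)

# Print both densities on a coarse grid of log-population values.
lo, hi = min(logs), max(logs)
for i in range(11):
    x = lo + (hi - lo) * i / 10
    print(f"{x:6.2f}  kde={kde(x):.4f}  normal={normal(x):.4f}")
```

Where the two columns disagree noticeably on real data, that is where a single log-normal is a poor fit and a sum of Gaussians might do better.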

I haven't tested on past populations, so perhaps my sample size is insufficient. Still, if the sample is adequate, that would explain why Warren's exponential function worked well -- and presumably, a log-normal fit would work better yet.

----
Election-Methods mailing list - see http://electorama.com/em for list info
