On 07/11/2012 08:16 PM, Michael Ossipoff wrote:
On Tue, Jul 10, 2012 at 6:17 AM, Kristofer Munsterhjelm
<[email protected]> wrote:
On 07/09/2012 06:33 AM, Michael Ossipoff wrote:
What about finding, by trial and error, the allocation that minimizes the
calculated correlation measure? Say, the Pearson correlation, for example.
Find by trial and error the allocation with the lowest Pearson correlation
between q and s/q.
For the goal of getting the best allocation each time (as opposed to overall
time-averaged equality of s/q), might that correlation optimization be best?
Sure, you could empirically optimize the method. If you want
population-pair monotonicity, then your task becomes much easier:
only divisor methods can have it
If unbias in each allocation is all-important, then can anything else be
as good as trial-and-error minimization of the measured correlation
between q and s/q, for each allocation?
You answered this below. If you know the distribution, then you can
directly find out what kind of rounding rule would be best and you'd use
that one.
That is, unless you mean something different: "if the only thing you care
about is correlation, then wouldn't limiting yourself to divisor methods be
a bad thing?" Well, perhaps a non-divisor method
would be closer to unbias, but divisor methods are close enough and you
don't get weird paradoxes. For practical purposes, as you've said, even
Sainte-Laguë/Webster is good enough.
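Incidentally, to make the correlation measure we keep referring to concrete,
here is a minimal sketch of it in Python (the quotas and seat counts are made
up, purely for illustration):

import numpy as np

# Hypothetical example: quotas q (exact proportional shares) and a whole-seat
# allocation s for a handful of states.
q = np.array([35.0, 10.7, 5.3, 2.6, 1.4])
s = np.array([35.0, 11.0, 5.0, 3.0, 1.0])

# The bias measure under discussion: correlation between a state's size (q)
# and its seats-per-quota ratio (s/q).  An unbiased method should show no
# systematic relationship between the two.
print(np.corrcoef(q, s / q)[0, 1])   # > 0: large states favored, < 0: small states favored

Minimizing the absolute value of that number, over allocations or over method
parameters, is the trial-and-error optimization being discussed.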
so you just have to find the right parameter for the generalized
divisor method:
f(x,g) = floor(x + g(x))
where g(x) is within [0...1] for all x, and one then finds a divisor
so that x_1 = voter share for state 1 / divisor, so that sum over
all states is equal to the number of seats.
[unquote]
Yes, that's a divisor method, and its unbias depends on whether or not
the probability density distribution approximation on which it's based
is accurate. For Webster, it's known to be a simplification. For
Weighted-Webster (WW), it's known to be only a guess.
It does seem to be pretty unbiased in Warren's computer simulations,
slightly better than ordinary Webster. See
http://www.RangeVoting.org/BishopSim.html. But of course, you can't
entirely exclude the presence of bugs.
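For reference, here is roughly what the quoted construction looks like in
code. This is only a sketch: the function name, the bisection bounds and the
example populations are mine, and it assumes there are at least as many seats
as states.

import math

def apportion(populations, seats, g=lambda x: 0.5):
    """Generalized divisor method: state i gets floor(x_i + g(x_i)) seats,
    where x_i = population_i / divisor, and the divisor is chosen so that
    the seats sum to the house size.  g(x) = 0.5 gives Webster/Sainte-Lague,
    g(x) = 0 gives Jefferson/D'Hondt, g(x) = 1 gives Adams."""
    def alloc(divisor):
        return [math.floor(p / divisor + g(p / divisor)) for p in populations]
    # The total handed out is (weakly) decreasing in the divisor, so bisect
    # between a divisor that is clearly too small and one clearly too large.
    lo, hi = sum(populations) / (seats + len(populations)), sum(populations)
    for _ in range(60):
        mid = (lo + hi) / 2
        if sum(alloc(mid)) > seats:
            lo = mid          # too many seats handed out: the divisor must grow
        else:
            hi = mid
    return alloc(hi)

# Hypothetical five-state example, 55 seats (quotas 35.0, 10.7, 5.3, 2.6, 1.4):
print(apportion([35000, 10700, 5300, 2600, 1400], 55))   # Webster: [35, 11, 5, 3, 1]

Plugging in a constant other than 0.5, or an interval-dependent g, gives the
variants discussed below.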
You said:
We may further restrict ourselves to a "somewhat" generalized divisor
method:
f(x, p) = floor(x + p).
For Webster, p = 0.5. Warren said p = 0.495 or so would optimize in the
US (and it might, I haven't read his reasoning in detail).
[endquote]
Yes, Warren said that if the probability distribution is exponential,
then that results in a constant p in your formula. He used one
exponential function for the whole range of states and their
populations, determined based on the total numbers of states and seats.
But that's a detail that isn't important unless you've actually decided
to use WW, and to use Warren's one overall exponential distribution.
After I'd proposed WW, Warren suggested the one exponential probability
distribution for the whole range of state populations, and that was his
version of WW.
The aforementioned page also shows Warren's modified Webster to be
better than Webster, and to have very low bias, both for states with
exponentially distributed populations and for states with uniformly
distributed populations.
You said:
Also, I think that the bias is monotone with respect to p. At one end
you have
f(x) = floor(x + 0) = floor(x)
which is Jefferson's method (D'Hondt) and greatly favors large states.
At the other, you have
f(x) = floor(x + 1) = ceil(x)
which is Adams's method and greatly favors small states.
If the bias of f(x, p) is monotone as p is varied, then you
could use any number of root-finding algorithms to find the p that sets
bias to zero, assuming your bias measure is continuous. Even if it's not
continuous, you could find p so that decreasing p just a little leads
your bias measure to report large-state favoritism and increasing p just
a little leads your bias measure to report small-state favoritism.
[endquote]
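As an aside, here is a rough sketch of that bisection-on-p idea, using the
correlation between q and s/q, averaged over simulated population sets, as a
stand-in bias measure. Everything here (the population model, the trial
counts, the helper names) is illustrative, not a description of Warren's
actual simulations.

import math
import numpy as np

def allocate(populations, seats, p):
    """f(x, p) = floor(x + p) divisor method; bisect on the divisor until the
    seats add up to the house size."""
    def alloc(d):
        return [math.floor(pop / d + p) for pop in populations]
    lo, hi = sum(populations) / (seats + len(populations)), sum(populations)
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if sum(alloc(mid)) > seats else (lo, mid)
    return np.array(alloc(hi), dtype=float)

def mean_bias(p, trials=200, states=20, seats=435):
    """Average correlation between q and s/q over simulated population sets.
    The seed is fixed so this is a deterministic function of p."""
    rng = np.random.default_rng(0)
    total = 0.0
    for _ in range(trials):
        pops = rng.exponential(size=states) + 0.02   # assumed population model
        q = pops * seats / pops.sum()
        total += np.corrcoef(q, allocate(pops, seats, p) / q)[0, 1]
    return total / trials

# Small p (Jefferson-like) favors large states (bias > 0), large p (Adams-like)
# favors small states (bias < 0), so the sign flips somewhere in between.
lo, hi = 0.0, 1.0
for _ in range(15):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if mean_bias(mid) > 0 else (lo, mid)
print((lo + hi) / 2)   # should land somewhere near Webster's 0.5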
You're referring to trial-and-error algorithms. You mean finding, by trial
and error, the p that will always give the lowest correlation between q
and s/q? For there to be such a constant p, you have to already know
that the probability distribution is exponential (because, it seems to
me, that was the assumption that Warren said results in a constant p for
an unbiased formula).
Then why is the modified Webster so good on uniformly distributed
populations?
If you know that it's exponential, you could find
out p without trial and error, by analytically finding the rounding
point for which the expected s/q is the same in each interval between
two consecutive integers, given some assumed probability distribution
(exponential, because that's what Warren said results in a constant p).
As I was saying before, it's solvable if the distribution-approximating
function is analytically antidifferentiable, as is the case for an
exponential or polynomial approximating function.
You might say that it could turn out that solving for R, the rounding
point, requires a trial-and-error equation-solving algorithm. I don't
think it would, because R only occurs at one place in the expression. We
had analytical solutions.
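For what it's worth, that kind of solve can be done symbolically. Below is a
sketch in sympy under one possible reading of the condition (expected s/q
equal to 1 over each interval) and an assumed uniform density within the
interval; the exponential case works the same way, with an exp(-lambda*q)
weight inside the integrals.

import sympy as sp

n, q, R = sp.symbols('n q R', positive=True)

# States with quota q in [n, R) get n seats; those in [R, n+1) get n+1.
# Assumed unbiasedness condition: the average of s/q over the interval is 1
# (uniform density, and the interval has length 1).
condition = sp.Eq(sp.integrate(n / q, (q, n, R))
                  + sp.integrate((n + 1) / q, (q, R, n + 1)), 1)

# R appears in only one place, so the solution is closed-form:
print(sp.solve(condition, R))
# R = exp((n+1)*log(n+1) - n*log(n) - 1), i.e. (n+1)**(n+1) / (n**n * e)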
But, as I was saying, you only know that WW is unbiased to the extent
that you know that your distribution-approximating function is accurate.
I felt that interpolation with a few cumulative-state-number(population)
data points, or least-squares with more data points, would be better.
Warren preferred finding one exponential function to cover the entire
range of state populations, based on the total numbers of states and seats.
I guess that trying all 3 ways would show which can give the lowest
correlations between q and s/q.
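As an illustration of the least-squares option (with made-up populations and
my own variable names): for an exponential density, the logarithm of the
cumulative number of states at or above a given population is linear in the
population, so an ordinary least-squares line through those points estimates
the rate of the exponential.

import numpy as np

# Hypothetical state populations, largest first.
pops = np.array([35000., 10700., 5300., 2600., 1400., 900., 600.])
ranks = np.arange(1, len(pops) + 1)      # rank i = number of states >= pops[i-1]
slope, intercept = np.polyfit(pops, np.log(ranks), 1)
print(-slope)                            # estimated rate of the fitted exponential

Interpolation through a few of those cumulative data points, or Warren's
single exponential derived from the totals, could be compared against this on
the same correlation measure.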
It seems that even if you don't know anything about distributions, you
could gather past data and optimize p just by educated trial and error.
Actually, I think one could go further. Consider a method that tries
every possible allocation to find the one with least bias. Then you know
what your assembly "should" look like. Now you can use a divisor method
and some inference or black-box search algorithms to find out what g(x)
"should" look like for various x. Repeat for different assemblies to get
a better idea of the shape of g(x), then fit a function to it and there
you go.
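A brute-force sketch of that "try every possible allocation" step, feasible
only for toy sizes (the function name and the example quotas are mine):

import itertools
import numpy as np

def least_biased_allocation(quotas, seats):
    """Enumerate every allocation of `seats` whole seats (at least one per
    state) and return the one with the smallest |correlation| between q and
    s/q.  Exponential in the number of states, so only a reference method."""
    q = np.array(quotas, dtype=float)
    best, best_score = None, float('inf')
    for s in itertools.product(range(1, seats + 1), repeat=len(quotas)):
        if sum(s) != seats:
            continue
        score = abs(np.corrcoef(q, np.array(s, dtype=float) / q)[0, 1])
        if score < best_score:
            best, best_score = s, score
    return best

print(least_biased_allocation([1.4, 2.6, 6.0], 10))

Comparing such reference allocations with what floor(x + g(x)) produces for
candidate g functions is the inference step described above.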
But in practice, I agree with you. Sainte-Laguë is good enough. If you
really want better, try all the different suggestions on Warren's page
and one of them most likely will be good enough. If none are, *then* one
may consider more exotic approaches.
----
Election-Methods mailing list - see http://electorama.com/em for list info