[email protected] wrote:
Here's a suggestion for detecting clone sets based on Range Ballots:

Define the distance between two candidates as the square root of the
sum (over the ballots) of the squared diffference of their respective
ratings.

If the ballots are approval style, this becomes the square root of
the number of ballots on which just one of the two candidates is
approved.

Use these distances to do a cluster analysis of the candidates.

The tree structure of the clusters gives the tree structure of the
clone sets.

If you have a clone dependent method, like Copeland, that you would
like to "de-clone," you might find this clone set detectiion method
to be useful.

I wonder if that could be used to fix the problem of my Kemeny-based forced clustering method, which is that it degrades when the "clusters" (supporters of various candidates) overlap. It still beats STV (and even QPQ) according to my simulations, but improvement is improvement, no?

The problem is this: If you have 2 winners, and everybody votes bloc style for one of two "virtual parties", then the clustering method minimizes sum distance by allocating every voter to the right bloc.

However, if we add another seat and candidate, call the candidate A, and A is unanimously voted top, then we'd ideally want the method to pick A, and then pick the other seats as if A had not been in the running*. Plain clustering doesn't do that: instead, one of the seats will have A elected (as he deserves), but the minimization won't pick randomly from later preferences, so the set of ballots not allocated to that seat won't be representative of the ballots, had not A run.

One way to fix this is to use elimination, but that's hardly monotonic. I've been considering adding a hierarchy to the clustering, which is similar to what you're saying above, but I haven't quite found out how to do that yet.

Since my multiwinner method uses rankings, I use Kemeny distance to determine the distance between ballots and a "representative" ballot for a cluster. Then I simply find the orderings that minimize the total distance between ballots allocated to the clusters, and to the relevant cluster's representative ballot, through brute force (which is why the temporal complexity is so horrible).

In any case, even if your tree clustering idea can't be used for my method, perhaps the idea of clustering by distance to remove clones could be adapted to ranked methods by using the Kemeny distance between ballot orders.

-

* This is true only if the distribution of additional preferences for those who vote for A is equal to that of the voters in general, A notwithstanding. There might be additional constraints as well, which is why I'm using "unanimous A top" as an example - since removing A doesn't alter the other candidate distributions in any way.
----
Election-Methods mailing list - see http://electorama.com/em for list info

Reply via email to