Re: [EM] Dectecting Clone Sets

Kristofer Munsterhjelm Mon, 16 Nov 2009 23:42:13 -0800

[email protected] wrote:

Here's a suggestion for detecting clone sets based on Range Ballots:


Define the distance between two candidates as the square root of the
sum (over the ballots) of the squared difference of their respective
ratings.

If the ballots are approval style, this becomes the square root of
the number of ballots on which just one of the two candidates is
approved.
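
As a rough illustration of the distance just defined, here is a minimal Python sketch. The function name candidate_distance and the ballot layout (one dict per ballot, mapping candidate to rating) are assumptions made for the example, not anything specified in this thread.

```python
import math

def candidate_distance(ballots, a, b):
    """Square root of the summed squared rating differences between a and b.

    ballots: list of dicts mapping candidate name -> numeric rating
    (hypothetical layout chosen for this sketch).
    """
    return math.sqrt(sum((ballot[a] - ballot[b]) ** 2 for ballot in ballots))

# Approval ballots treated as range ballots with ratings 0 and 1.
approval = [
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 0, "C": 0},
    {"A": 0, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 1},
]

# Exactly one of A and B is approved on two ballots (the 2nd and 3rd),
# so the distance is the square root of 2.
d_ab = candidate_distance(approval, "A", "B")
```

With 0/1 ratings each squared difference is 1 when exactly one of the two candidates is approved and 0 otherwise, which is why the approval case reduces to a count.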

Use these distances to do a cluster analysis of the candidates.

The tree structure of the clusters gives the tree structure of the
clone sets.
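
The suggestion does not say which cluster analysis to use. The sketch below assumes plain agglomerative clustering with average linkage, which is only one possible choice; the names cluster_tree and leaves, and the dict-per-ballot layout, are hypothetical. The nested tuple it returns plays the role of the tree of candidate clone sets.

```python
import math
from itertools import combinations

def candidate_distance(ballots, a, b):
    """Root of summed squared rating differences between candidates a and b."""
    return math.sqrt(sum((r[a] - r[b]) ** 2 for r in ballots))

def leaves(tree):
    """Flatten a nested-tuple tree into the list of candidate names it holds."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def cluster_tree(ballots, candidates):
    """Average-linkage agglomerative clustering (an assumed choice).

    Repeatedly merges the two closest clusters and returns a nested tuple.
    """
    def linkage(x, y):
        pairs = [(a, b) for a in leaves(x) for b in leaves(y)]
        return sum(candidate_distance(ballots, a, b) for a, b in pairs) / len(pairs)

    clusters = list(candidates)
    while len(clusters) > 1:
        x, y = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters = [c for c in clusters if c not in (x, y)]
        clusters.append((x, y))
    return clusters[0]

# A1 and A2 are rated almost identically on every ballot; B is not.
ballots = [
    {"A1": 10, "A2": 9, "B": 0},
    {"A1": 8, "A2": 9, "B": 1},
    {"A1": 0, "A2": 1, "B": 10},
    {"A1": 2, "A2": 1, "B": 9},
]
tree = cluster_tree(ballots, ["A1", "A2", "B"])
```

In this toy example A1 and A2 are merged first, so they show up as a subtree of their own. Where to cut the tree to declare a subtree a clone set is a separate decision that this sketch does not address.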

If you have a clone dependent method, like Copeland, that you would
like to "de-clone," you might find this clone set detection method
to be useful.

I wonder if that could be used to fix the problem of my Kemeny-based forced clustering method, which is that it degrades when the "clusters" (supporters of various candidates) overlap. It still beats STV (and even QPQ) according to my simulations, but improvement is improvement, no?

The problem is this: If you have 2 winners, and everybody votes bloc style for one of two "virtual parties", then the clustering method minimizes sum distance by allocating every voter to the right bloc.

However, if we add another seat and candidate, call the candidate A, and A is unanimously voted top, then we'd ideally want the method to pick A, and then pick the other seats as if A had not been in the running*. Plain clustering doesn't do that: instead, one of the seats will have A elected (as he deserves), but the minimization won't pick randomly from later preferences, so the set of ballots not allocated to that seat won't be representative of the ballots, had not A run.

One way to fix this is to use elimination, but that's hardly monotonic. I've been considering adding a hierarchy to the clustering, which is similar to what you're saying above, but I haven't quite found out how to do that yet.

Since my multiwinner method uses rankings, I use Kemeny distance to determine the distance between ballots and a "representative" ballot for a cluster. Then I simply find the orderings that minimize the total distance between ballots allocated to the clusters, and to the relevant cluster's representative ballot, through brute force (which is why the temporal complexity is so horrible).
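
To make the brute-force idea concrete, here is a simplified Python sketch, not the actual implementation. It assumes strict, complete rankings given as tuples, takes Kemeny distance to be the number of candidate pairs two rankings order differently, and lets every ballot go to its nearest representative with no constraint on cluster sizes (the real method may well impose one). The names kemeny_distance and brute_force_clusters are made up for the example.

```python
from itertools import combinations, permutations

def kemeny_distance(r1, r2):
    """Number of candidate pairs that two strict rankings order differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] < pos1[b]) != (pos2[a] < pos2[b])
    )

def brute_force_clusters(ballots, k):
    """Try every set of k representative orderings.

    Each ballot is counted at its distance to the nearest representative
    (assumption: no cluster-size constraint). Returns (total, representatives).
    """
    candidates = sorted(ballots[0])
    orderings = list(permutations(candidates))
    best = None
    for reps in combinations(orderings, k):
        total = sum(min(kemeny_distance(b, r) for r in reps) for b in ballots)
        if best is None or total < best[0]:
            best = (total, reps)
    return best

# Two opposed blocs: with k = 2 each bloc gets its own representative.
ballots = [("A", "B", "C")] * 3 + [("C", "B", "A")] * 2
total, reps = brute_force_clusters(ballots, 2)
```

With m candidates there are m! orderings and the loop visits every k-subset of them, which gives a feel for why the running time blows up so quickly.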

In any case, even if your tree clustering idea can't be used for my method, perhaps the idea of clustering by distance to remove clones could be adapted to ranked methods by using the Kemeny distance between ballot orders.

* This is true only if the distribution of additional preferences for those who vote for A is equal to that of the voters in general, A notwithstanding. There might be additional constraints as well, which is why I'm using "unanimous A top" as an example - since removing A doesn't alter the other candidate distributions in any way.

----
Election-Methods mailing list - see http://electorama.com/em for list info
