Suppose that voters fill out questionnaires of twenty or more yes/no questions. What is a good way to calculate the “distance” between questionnaires? (Remember, this is the key to getting an IIAC-compliant voting system.)
The big problem is that some of the questions are apt to be clones of each other. Suppose, for example, that of the twenty questions on a questionnaire, the first fifteen were basically the same question in disguise. Then almost all of the voters who answered yes on the first question would answer yes on the next fourteen as well. Those fourteen questions would not only be redundant; they would also distort the perceived distance between questionnaires if you used any of the standard metrics (Hamming, Euclidean, etc.) on the vectors of zeroes and ones.

First suggestion: Have each voter assign weights to the questions to reflect their relative importance to that voter, then normalize the weights so that they add to 100. Given two questionnaires q1 and q2, the semi-metric rho(q1,q2) is the sum of q1’s weights on all of the questions on which q1 and q2 disagree. This is a measure of how far the q1 voter thinks the q2 voter differs from her on important questions. Since rho is not symmetric in general, the proposed metric in this first suggestion is the symmetrized sum d(q1,q2)=rho(q1,q2)+rho(q2,q1).

Second suggestion:

1. Create a binary tree with the questionnaires as the leaves and a subset of the questions as nodes, as follows. The root node is the question on which the voters are most evenly balanced (break ties randomly). Each subsequent node X is the question on which the voters whose answers lead to that node are most evenly divided (again breaking ties randomly).

2. Once all of the questionnaires have been classified as leaves on this binary tree, assign to each question a weight equal to the probability that a random leaf has that question as an ancestor.

3. The distance between two questionnaires is the sum of the weights of the questions on which they differ.

Any other good ideas?

----
Election-Methods mailing list - see http://electorama.com/em for list info
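For anyone who wants to experiment, here is a rough Python sketch of the two suggestions above. It is only one possible reading of them, not a definitive implementation. The function names (rho, weighted_distance, tree_weights, tree_distance) are made up for illustration. Two assumptions go beyond the text: every ballot counts as its own leaf, so identical questionnaires share a branch and a "random leaf" means a random ballot; and a branch stops splitting when no unused question divides the voters in it.

```python
import random

def rho(weights, q_own, q_other):
    """First suggestion: sum of the first voter's weights (normalized to
    add to 100) over the questions on which the two questionnaires differ."""
    total = sum(weights)
    return sum(100 * w / total
               for w, a, b in zip(weights, q_own, q_other) if a != b)

def weighted_distance(q1, w1, q2, w2):
    """d(q1,q2) = rho(q1,q2) + rho(q2,q1), each side using its own weights."""
    return rho(w1, q1, q2) + rho(w2, q2, q1)

def tree_weights(ballots, rng=None):
    """Second suggestion, steps 1 and 2: build the tree recursively and
    return, for each question, the fraction of ballots whose path from the
    root passes through a node labelled with that question."""
    rng = rng or random.Random(0)   # assumption: seeded so runs are repeatable
    n_questions = len(ballots[0])
    passed = [0] * n_questions      # ballots passing through each question

    def split(group, unused):
        # Score every unused question that actually divides this group;
        # a score of 0 means a perfectly even yes/no split.
        scored = []
        for q in sorted(unused):
            yes = sum(b[q] for b in group)
            if 0 < yes < len(group):
                scored.append((abs(2 * yes - len(group)), q))
        if not scored:
            return                  # nothing divides the group: leaves
        best = min(score for score, _ in scored)
        q = rng.choice([q for score, q in scored if score == best])
        passed[q] += len(group)
        rest = unused - {q}
        split([b for b in group if b[q] == 1], rest)
        split([b for b in group if b[q] == 0], rest)

    split(list(ballots), frozenset(range(n_questions)))
    return [count / len(ballots) for count in passed]

def tree_distance(q1, q2, weights):
    """Second suggestion, step 3: sum of the weights of the questions
    on which the two questionnaires differ."""
    return sum(w for w, a, b in zip(weights, q1, q2) if a != b)
```

As a small check on the clone problem, take the four ballots (1,1,1), (1,1,0), (0,0,1), (0,0,0), where the first two questions are clones and the third is independent. However the ties happen to be broken, the two clones end up sharing a combined weight of 1 and the independent question gets weight 1, so (1,1,1) and (0,0,1) are at tree distance 1 rather than the Hamming distance 2. For the first suggestion, ballots (1,0,1) and (0,0,1) with raw weights (2,1,1) and (1,1,2) differ only on the first question, giving rho values of 50 and 25 and d = 75.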
