Warren Smith wrote:
--seems to me, KM is correct; if the Smith Set has less than about 20
members (and if this also is true recursively upon removing it), then
it always will be feasible to find the Kemeny order.

There are two related criteria in play here. First is plain old Smith. If the Kemeny method passes Smith, then we know all Smith set members must rank above non-Smith-set members (and so on recursively). Kemeny does pass Smith. If Kemeny also passes ISDA, then we can outright remove all the candidates that aren't in the Smith set because they won't alter the outcome anyway. So if Kemeny passes ISDA, your "and this also is true recursively" property doesn't have to hold.

How often will a (>=20)-member Smith set arise in practice?

I do not know.   KM "can't imagine" it will happen, but that's his guess.
I think if you take a 100-candidate election with random-uniform
votes, it'd happen pretty often. Such elections have Condorcet winners
about 9-12% of the time (says my computer), which leads me to guess
that they have (>=20)-member Smith sets about 10% of the time.  See
also puzzles
25 and 26 here for some theorems about the closely related problem of
random "round robin tournaments" rather than random "elections":
   http://www.rangevoting.org/PuzzlePage.html

I think a ballot with significantly more than 20 members will at least be hard to fill out (and the Australian ballots tend to support this claim). Say each voter fills out the ranks of k candidates, where k is 1...i (random), i < 20 <= number of candidates. Does that decrease the probability of a >= 20 member Smith set?

But it could be argued/hoped that 100-candidate elections, when they
occur, will have vote distributions quite dissimilar to
random-uniform.
It could be counterargued that humanity's experience with
100-candidate elections is small, and with 100-candidate
rank-order-ballot
elections is even smaller, so there is little basis for KM's guess.

I'm basing my guess on what election data exists (the vast majority of which have singular Condorcet winners), but it is of course possible that the voters will vote in a way that leads to a greater Smith set - either because there are many candidates or because the candidates are shielded from vote-splitting strategy.

Tideman in his book has got a statistical semiempirical model, based
on real election statistics, of rank-order ballot elections.  His
model is described/corrected/recalculated here:
    http://rangevoting.org/TidemanElModel.html
and it finds the probability, in a 100-canddt election,that a
Condorcet Winner exists, is 27%, and for a 200-canddt
Tideman-statistics election this probability is 5.6%.

Then I suppose it should be possible to make a simulator to see how often it generates size k Smith sets. If only I had the time :-)

I think I solved a few instances of Warren's 27x27 matrix on [0...1
billion], but I can't be sure since I can't confirm that my Kemeny cost is
indeed the least possible. I am, however, assuming that it is, and so I'm
looking at larger sizes. Since these will be harder, I've thought of adding
an initialization step where the program tells the IP solver of a reasonably
low-cost guess. This could speed up the process,

--certainly it could.  If you use some nonrigorous heuristic to find a
good order, then you could then use the fact that the optimal order
must be at least that good, to help "prune" a subsequent exhaustive
search.
Lower bounds (i.e. in the other direction) can also be used for pruning
(if you have any).

If you have an integer program it can sometimes happen that the optimal
solution of the linear program (without constraint of having to have integers)
happens by luck to be all-integer.  In such a case, that proves it is optimal
for the integer program too.   However, in general integer programming
is NP-hard and such luck cannot be guaranteed.

Sometimes you can get partial such luck, e.g. you exhaustively try all
integer possibilities for the first two variables, and then you get
lucky as above about all the remaining variables in each case.   That
also would
find the optimum and yield an optimality proof.

If that is what happened for KM, then we could ask: what is the
"success rate," i.e. what percentage of the time do his integer
programs enjoy these
kinds of luck?

To the extent that branch-and-bound or branch-and-cut strategies can be considered to have luck, they get lucky to some degree on all problems, but they get more lucky on some than on others. If that wasn't the case and integer programming in n variables was as hard as brute force, Kemeny would be dead in the water.

As for linear programming being integer optimal or close enough that I don't notice any difference in solution speed, that *seems* to happen to me in the vast majority (again >90%, and I'd guess >95%, but I haven't tested) of the randomized instances on the distributions generated by my voting program. I do know your 27x27 matrix is not one of these instances, since the IP solver goes past the LP stage there, but it's still solvable. Perhaps better integer programming libraries could solve them faster - I use glpk. I don't know the details of branch depth, extent of pruning, and so on, though.

----
Election-Methods mailing list - see http://electorama.com/em for list info

Reply via email to